Updates_LiteratureReviews (05)

Title:  Safeguard against Unicode Attacks: Generation and Applications of UC-SimList

Authors:    Anthony Y. Fu, City University of Hong Kong, Hong Kong

Wan Zhang, City University of Hong Kong, Hong Kong

Xiaotie Deng, City University of Hong Kong, Hong Kong

Liu Wenyin, City University of Hong Kong, Hong Kong

Reference: Click to Open Journal


In my previous updates, I talked about a video I created for the project. This week, I returned my focus to reviewing the literature. The work I chose to read is about Unicode attacks. A Unicode attack basically tries to encode characters in a URL so that they escape application filtering. The paper proposes ways to prevent this kind of attack and develops an application to handle the job. Below I explain the strategy the researchers used to implement their ideas.


I liked the ideas that went into the project. The researchers' goal was to prevent Unicode attacks by creating an API package that supports their work. Unfortunately, accessing the API is difficult. The link to download the API is http://antiphishing.cs.cityu.edu.hk/, but the server seems to be down, or I simply cannot access it.

The design of the project is to compare the entered text string against the Universal Character Set (UCS). The team created two sets of the Unicode Similarity List (UC-SimList): 1) UC-SimList_s, which has the characters from the original UCS, and 2) UC-SimList_v, which, according to the researchers, is created manually. For each pair of characters, the visual similarity and the semantic similarity are multiplied together. They also measure visual similarity using 2D kernel densities estimated from sample points on each character's contour.
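Since I could not get hold of the actual API, here is only a rough sketch of how I imagine a similarity-list lookup could work. Everything in it is my own assumption: the names TOY_SIMLIST, pair_similarity, string_similarity and looks_like are hypothetical, the scores are made up, and taking the weakest character pair as the score for a whole string is my choice, not something taken from the paper. The only part that follows the description above is multiplying the visual and semantic similarity for each pair.

```python
# Toy similarity list: (char_a, char_b) -> (visual, semantic) scores in [0, 1].
# Entries and scores are invented for illustration; they are NOT from the paper.
TOY_SIMLIST = {
    ("a", "\u0430"): (0.98, 1.0),  # Latin "a" vs Cyrillic "a"
    ("o", "\u043e"): (0.98, 1.0),  # Latin "o" vs Cyrillic "o"
    ("o", "0"): (0.80, 0.5),       # letter "o" vs digit zero
    ("l", "1"): (0.85, 0.5),       # letter "l" vs digit one
}


def pair_similarity(a, b):
    """Visual similarity times semantic similarity for one character pair."""
    if a == b:
        return 1.0
    # The toy list stores each pair once, so try both orders.
    scores = TOY_SIMLIST.get((a, b)) or TOY_SIMLIST.get((b, a))
    if scores is None:
        return 0.0
    visual, semantic = scores
    return visual * semantic


def string_similarity(s, t):
    """Score two equal-length strings by their weakest character pair (assumption)."""
    if len(s) != len(t):
        return 0.0
    return min((pair_similarity(a, b) for a, b in zip(s, t)), default=1.0)


def looks_like(candidate, target, threshold=0.7):
    """True when candidate differs from target but scores at or above the threshold."""
    return candidate != target and string_similarity(candidate, target) >= threshold


if __name__ == "__main__":
    spoofed = "p\u0430ypal"  # second letter is Cyrillic
    print(looks_like(spoofed, "paypal"))
    print(looks_like("paypa1", "paypal"))
```

With these made-up scores, the Cyrillic spoof is flagged while "paypa1" is not, because the digit pair has a low semantic score. A real list would cover far more characters, and the threshold would need tuning.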


Because I wasn’t able to see the API package, I cannot tell which aspects of the project need improving. The only question I needed answered was how this package is used. One modest improvement would be to provide more information about the project; I felt the information provided was limited. Some of the key concepts introduced were unclear or never defined. For example, the Universal Character Set (UCS) was not clearly defined.

