Title: Safeguard against Unicode Attacks: Generation and Applications of UC-SimList
Reference: Click to Open Journal
In my previous updates, I talked about a video I created on the project. This week, I returned my focus to reviewing the literature. The paper I chose to read covers Unicode attacks. A Unicode attack essentially tries to encode characters in a URL so that they escape application filtering. The paper proposes ways to prevent this kind of attack by developing an application to handle the job. Below I explain the strategy the researchers used to implement their ideas.
I liked the ideas that went into the project. Their goal was to prevent Unicode attacks by creating an API package that supports their work. Unfortunately, accessing the API is difficult. The link to download the API is: http://antiphishing.cs.cityu.edu.hk/. The server seems to be down, or I simply cannot access it.
The design of the project is to compare the entered string of text against the Universal Character Set (UCS). The team created two Unicode Similarity Lists (UC-SimList): 1) UC-SimList_s: this list has the characters from the original UCS, and 2) UC-SimList_v: this list, according to the researchers, is created manually. Each pair of characters is scored by multiplying its visual similarity by its semantic similarity. They also measure the characters with 2D kernel densities estimated from sample points on each character's contour.
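Since I could not get the API, here is a minimal sketch of how I understand the pair-scoring idea described above. It is my own illustration, not the researchers' code. The tables VISUAL and SEMANTIC, the scores inside them, and the function names pair_score, string_similarity and looks_like are all hypothetical. Averaging the pair scores over a string and using a 0.8 threshold are also my assumptions.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]

# Hypothetical similarity scores in [0, 1]. These numbers are made up
# for illustration and are NOT taken from the paper or its API.
VISUAL: Dict[Pair, float] = {
    ("a", "\u0430"): 1.0,  # Latin "a" vs Cyrillic "а"
    ("o", "0"): 0.8,
    ("l", "1"): 0.9,
}
SEMANTIC: Dict[Pair, float] = {
    ("a", "\u0430"): 0.9,
    ("o", "0"): 0.5,
    ("l", "1"): 0.5,
}


def pair_score(c1: str, c2: str) -> float:
    """Score a character pair as visual similarity times semantic similarity."""
    if c1 == c2:
        return 1.0
    # The tables store each pair once, so try both orderings.
    key = (c1, c2) if (c1, c2) in VISUAL else (c2, c1)
    return VISUAL.get(key, 0.0) * SEMANTIC.get(key, 0.0)


def string_similarity(s1: str, s2: str) -> float:
    """Average the pair scores position by position (an assumption of this sketch)."""
    if len(s1) != len(s2):
        return 0.0
    if not s1:
        return 1.0
    return sum(pair_score(a, b) for a, b in zip(s1, s2)) / len(s1)


def looks_like(candidate: str, target: str, threshold: float = 0.8) -> bool:
    """Flag a string that differs from the target but scores above the threshold."""
    return candidate != target and string_similarity(candidate, target) >= threshold


if __name__ == "__main__":
    spoofed = "p\u0430ypal"  # second letter is Cyrillic
    print(string_similarity(spoofed, "paypal"))
    print(looks_like(spoofed, "paypal"))
```

A real list would cover far more characters, and the visual scores would come from a measurement such as the contour-based densities mentioned above rather than from hand-picked numbers.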
Because I wasn’t able to see the API package, I cannot tell which aspects of their project need improving. The only question I needed answered was how this package is used. One partial improvement would be to provide more information about the project. I felt the information provided was limited. Some of the key concepts introduced were unclear or never defined. For example, the Universal Character Set (UCS) was not clearly defined.