The Personal Information Protection Act is a law that protects the freedom and rights of individuals and aims to uphold individual dignity and value. Under the act, personal information is defined as information that can easily identify an individual when combined with other information, and phone numbers are regarded as one of its main types. This post examines a PUP (Potentially Unwanted Program) that collects phone numbers.
Figure 1 shows a PUP with the filename “OOO Joonggonara phone number extraction program”. Joonggonara is a web community with 19 million members. It is a resale platform where members register and sell second-hand goods they no longer use.
Figure 2 shows part of the “General Joonggonara Operation Policy (Revised on Oct 20, 2021)”. According to this excerpt, a user who writes a post to sell a product and has not agreed to the exposure of their phone number must enter the seller’s information manually. Figure 3 shows an example sales post with the seller’s information, which displays either the seller’s phone number or a virtual phone number. Buyers can click the ‘See contact details’ button to check the seller’s phone number.
The PUP collects these phone numbers, and Figure 4 shows the part of the PUP that crawls posts from websites. It accesses the mobile website addresses, and the UserAgent value it sends can be seen in the code: the value identifies the client as Chrome running on a Windows 10 x64 environment.
The program checks the result.saleInfo.phoneNo value in the accessed web page and saves any value that starts with “010-” together with its ID. If there is no matching value, it reads result.article.contentHtml, uses regular expressions to search for the numerals and characters used in phone numbers, and saves any match alongside the ID.
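The two-step check described above can be sketched in Python to make the fallback order easier to follow. This is only an illustration of the described logic, not the PUP's actual code: the field names result.saleInfo.phoneNo and result.article.contentHtml come from the analysis, while the function name, the regular expression, and the sample data are assumptions made for this sketch.

```python
import re
from typing import Optional

# Assumed pattern: a number beginning with 010, with optional separators.
# The regular expression actually used by the PUP is not reproduced here.
PHONE_IN_TEXT = re.compile(r"010[-\s.]?\d{3,4}[-\s.]?\d{4}")
# Crude tag stripper so that markup does not interrupt the search.
HTML_TAG = re.compile(r"<[^>]+>")


def find_exposed_number(result: dict) -> Optional[str]:
    """Return the number such a check would find, or None (hypothetical helper)."""
    # Step 1: the structured field, accepted only if it starts with "010-".
    phone = (result.get("saleInfo") or {}).get("phoneNo") or ""
    if phone.startswith("010-"):
        return phone
    # Step 2: fall back to searching the post body.
    html = (result.get("article") or {}).get("contentHtml") or ""
    match = PHONE_IN_TEXT.search(HTML_TAG.sub(" ", html))
    return match.group(0) if match else None


# Fictional sample: a virtual number in the field, a real one typed in the body.
sample = {
    "saleInfo": {"phoneNo": "0504-1234-5678"},
    "article": {"contentHtml": "<p>Call 010 1234 5678</p>"},
}
print(find_exposed_number(sample))
```

The sample shows why a virtual number in the structured field is not enough on its own: a number typed into the body of the post is still reachable through the fallback step.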
An examination of posts in the web community found some in which the phone number was exposed in the body of the post, written either in characters or in a mix of characters and numerals (0 1 zero one 2 three 4 5 six 7 eight). Judging from the features of the phone number-collecting PUP, it appears able to collect both formats. Users who write posts in communities should therefore choose a method that does not directly expose their phone numbers.
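To see why spelling out digits offers little protection, the following minimal sketch normalizes a post body and checks whether a phone number is still recoverable. It could serve as a self-check before posting. The word list follows the English rendering used in the example above (actual posts would use Korean digit words), and all names and patterns are assumptions for illustration, not taken from the PUP.

```python
import re

# Assumed mapping, following the English rendering in the example above.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
DIGIT_WORD = re.compile(r"\b(" + "|".join(WORD_TO_DIGIT) + r")\b", re.IGNORECASE)
# Spaces, hyphens, or dots that sit between two digits.
SEPARATORS = re.compile(r"(?<=\d)[\s\-.]+(?=\d)")
# An 11-digit number beginning with 010, not embedded in a longer digit run.
MOBILE = re.compile(r"(?<!\d)010\d{8}(?!\d)")


def normalize_digits(text: str) -> str:
    """Replace digit words with numerals, then join adjacent digits."""
    replaced = DIGIT_WORD.sub(lambda m: WORD_TO_DIGIT[m.group(1).lower()], text)
    return SEPARATORS.sub("", replaced)


def exposes_phone_number(text: str) -> bool:
    """Return True if a phone number can be recovered from the text."""
    return MOBILE.search(normalize_digits(text)) is not None


print(normalize_digits("0 1 zero one 2 three 4 5 six 7 eight"))
print(exposes_phone_number("Contact me through the safe chat only"))
```

A few lines of normalization are enough to undo the mixed format, which is consistent with the observation that both formats can be collected.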
Unauthorized collection of personal information using PUPs such as the one above is punishable by law. Users should also be cautious about exposing personal data when sending it over the Internet.