The use case for applying machine learning in Patent Analytics
The KMX Patent Analytics solution offers Intellectual Property Professionals a platform for unique visual clustering and machine-learning-based categorization, built for analyzing large text collections such as social media, websites, email and patents.
In a comprehensive blog post, independent Patinformatics analyst Tony Trippe introduced the use case for applying machine learning in patent analytics in general and described the results of working with the KMX Patent Analytics solution.
Machine Learning in Patent Analytics – Part 2: Binary Classification for Prioritizing Search Results
Of the three machine learning tasks covered in Part 1 of this series, classification may be the least familiar to patent information professionals. The methods used for automatic classification have been around for some time and have been applied to patent information by patent offices, publishers and database producers, but few commercial tools have offered classification capabilities to analysts and information retrieval specialists. This is unfortunate, since statistical classification can potentially bring enormous benefits to patent information professionals. Before launching into an example of how classification can assist with identifying and prioritizing relevant references within large patent document sets, let’s look at some details of the task itself.
Binary classification provides a means of dividing a large collection of patent documents into references that are likely to be of highest interest to the information professional and references that are likely unrelated but were still retrieved in a broad search. The training set in this case is made up of references that are highly relevant to the interests of the analyst. To train the classifier, the analyst also needs to identify documents that are off-topic, so that the classifier can establish a hyperplane that separates the two categories.
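To make the idea of a separating hyperplane concrete, here is a minimal sketch of a linear classifier trained on hinge loss, the loss behind a linear SVM. It is not the KMX implementation or the method used in the original post. The snippets are invented stand-ins for patent abstracts, and the function names and parameter values (`epochs`, `lr`, `reg`) are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Invented toy snippets standing in for patent abstracts (not real patents).
# Label +1 = on-topic (fitness monitor), -1 = off-topic.
TRAINING = [
    ("wristband sensor tracks steps heart rate and sleep activity", 1),
    ("wearable band accelerometer monitors user activity and calories", 1),
    ("bracelet device measures motion and displays fitness goals", 1),
    ("wireless speaker housing with improved acoustic driver", -1),
    ("shoe sole cushioning structure with foam midsole", -1),
    ("headset microphone noise cancelling audio circuit", -1),
]

def featurize(text):
    """Bag-of-words term counts, scaled to unit length."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = sum(c * c for c in counts.values()) ** 0.5 or 1.0
    return {term: c / norm for term, c in counts.items()}

def score(weights, bias, features):
    """Signed score relative to the hyperplane w.x + b = 0."""
    return sum(weights.get(t, 0.0) * v for t, v in features.items()) + bias

def train_linear_svm(examples, epochs=200, lr=0.1, reg=0.01):
    """Sub-gradient descent on L2-regularized hinge loss (assumed settings)."""
    weights, bias = {}, 0.0
    data = [(featurize(text), label) for text, label in examples]
    for _ in range(epochs):
        for features, label in data:
            margin = label * score(weights, bias, features)
            for term in weights:              # shrink weights (regularization)
                weights[term] *= 1.0 - lr * reg
            if margin < 1.0:                  # example is inside the margin
                for term, value in features.items():
                    weights[term] = weights.get(term, 0.0) + lr * label * value
                bias += lr * label
    return weights, bias

def prioritize(model, documents):
    """Sort documents from most to least likely on-topic."""
    weights, bias = model
    return sorted(documents,
                  key=lambda d: score(weights, bias, featurize(d)),
                  reverse=True)

model = train_linear_svm(TRAINING)
ranked = prioritize(model, [
    "speaker housing with acoustic driver",
    "wearable wristband monitors heart rate and steps",
])
```

In this sketch, a positive score places a document on the relevant side of the hyperplane, and sorting by score gives a priority order for review. A real collection would need far more training documents and better text preprocessing than this toy setup.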
Use Case: Wearable Fitness Monitors
Wearable fitness monitors have been discussed previously, and this area of technology will provide the examples for this post and for a number of the remaining posts in this series. Aliphcom (doing business as Jawbone) sells the Up fitness monitor, while Nike competes with its Nike+ FuelBand product. Both organizations sell other products and have extensive portfolios that cover their fitness monitors as well as many additional items. Let’s study how a binary classifier can help identify the patents associated with the Up and the FuelBand in the midst of many other documents from these companies. (…)
Binary classification using a support vector machine (SVM) can be a powerful tool for prioritizing patents within a larger collection of documents. One of the best aspects of this method is that the classifiers, once created, can be reapplied to other collections, including new documents that publish on a weekly basis. In this fashion, a search can first be designed to maximize recall, and a classifier can then be applied in a second step to improve precision (Patinformatics, Anthony Trippe, August 2013).
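The two-step idea of searching for recall and then classifying for precision can be illustrated with a small calculation. The sketch below is an assumption-laden toy: the document IDs, the set of truly relevant documents, and the classifier scores are all invented, and the threshold of zero is simply the usual sign test for a linear classifier.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical data: ten documents from a broad search, three truly relevant.
relevant = {"P1", "P2", "P3"}
broad_search = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8", "P9", "P10"]

# Hypothetical classifier scores (positive = predicted on-topic).
scores = {
    "P1": 1.4, "P2": 0.9, "P3": 0.2, "P4": 0.3, "P5": -0.6,
    "P6": -1.1, "P7": -0.4, "P8": -0.9, "P9": -1.3, "P10": -0.2,
}

# Step 1: the broad search captures every relevant document, plus noise.
step1 = precision_recall(broad_search, relevant)

# Step 2: keep only the documents the classifier scores as on-topic.
kept = [doc for doc in broad_search if scores[doc] > 0]
step2 = precision_recall(kept, relevant)
```

With these invented numbers, the broad search finds all three relevant documents but only three of the ten retrieved are relevant. After filtering, four documents remain and three of them are relevant, so precision rises while recall is unchanged. In practice a classifier can also discard relevant documents, so recall after the second step should be checked rather than assumed.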
About Patinformatics and Anthony Trippe
Patinformatics, LLC is an advisory firm established to assist the members of the patinformatics community whether they be individual practitioners, database or tool builders, patent offices, universities or corporations. Anthony (Tony) Trippe is the Managing Director of Patinformatics, LLC. Tony has been a patent information professional for more than eighteen years and has spent the last fifteen years specializing in technical intelligence and patent analytics. Tony is listed in IAM Magazine’s Strategy 300 publication recognizing the top 300 IP Strategists in the world.