Big Content Discovery

Big Content Knowledge Discovery

Today, with vast amounts of information, both structured and unstructured, to be processed, analyzed, and understood, the need to discover the unexpected, to uncover patterns, and to predict outcomes calls for a probabilistic approach that retrieves information that is close to, but not necessarily a perfect match for, what was asked.

Business users increasingly require an approach that moves from deterministic analytics, familiar from the world of structured databases and predetermined schemas, to probabilistic analytics, with its emphasis on discovery, scenarios, and levels of uncertainty.

KMX is a ready-to-deploy (or ready-to-integrate) content analytics solution that extracts the elements of meaning from each document: names of people, places and things, time, location, sentiment and opinion, and the relationships among these extracted elements: facts and events, cause and effect, definitions. With KMX you can tag and label documents to improve information retrieval accuracy and create browsing interfaces such as content-based discovery dashboards.

KMX is used to explore collections of information and to find relationships in Intellectual Property portfolios, R&D libraries, drug side effects, ties between criminal organisations and terrorists, the influence of media on the agenda of politicians, or the disconnect between the supply and demand of human resources and their skills.

The search-based approach to knowledge discovery in KMX goes beyond simply searching existing content, Google-style, to find a list of reports. KMX extracts information elements and their relationships to each other to enable exploration, analysis and visualization across documents, sometimes without an overt query. It provides pathways through a collection of information, and all results tie back to the original document so that the context of the extracted information can be verified.

KMX in your eDiscovery solution

KMX assists Law & Legislation professionals with robust ‘predictive coding’ capabilities that provide all pertinent electronic documents to all parties to a lawsuit, and ignore all that are not pertinent, in a relatively fast and legally defensible manner.

E-discovery is the identification, preservation, collection, preparation, review and production of electronically stored information associated with legal and regulatory proceedings.

Legal counsel has to discover all potentially relevant documents, from all repositories and in whatever format the information exists; eliminate the duplicates but not the partial duplicates; separate privileged from non-privileged documents; identify people and documents for depositions; establish a chain of custody; ensure accountability; and provide secure access across locations, languages and time zones.
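The distinction between duplicates and partial duplicates can be illustrated with a minimal sketch. This is not how KMX works internally; it only assumes that a "duplicate" means identical content after normalizing whitespace and letter case, and the names `normalize` and `drop_exact_duplicates` are hypothetical.

```python
import hashlib


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace (an assumed definition of 'same content')."""
    return " ".join(text.lower().split())


def drop_exact_duplicates(docs: dict[str, str]) -> dict[str, str]:
    """Keep the first document seen for each content hash; later copies are dropped."""
    seen: set[str] = set()
    kept: dict[str, str] = {}
    for doc_id, text in docs.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept[doc_id] = text
    return kept


# Hypothetical sample collection.
docs = {
    "a": "Contract draft, version 1.",
    "b": "Contract  draft, VERSION 1.",  # same content, different spacing and case
    "c": "Contract draft, version 1. Amended clause 4.",  # partial duplicate
}
kept = drop_exact_duplicates(docs)
print(sorted(kept))  # → ['a', 'c']
```

Document "b" is removed as a duplicate of "a", while "c" survives because its content differs, which is the behaviour the paragraph above asks for.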

Key functions of KMX as part of eDiscovery:

  • it provides access to the content that is satisfactory to all parties to the legal matter, even though they are opposing counsel, and assists users in selecting the relevant data and removing non-relevant or privileged information (based on machine learning algorithms);
  • the collection includes everything that is related to the legal case, but nothing more;
  • over 30 man-years of development in robust text-mining and search-based technology deliver (and measure!) high Precision and Recall.
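For readers unfamiliar with the two measures, the sketch below shows how Precision and Recall are conventionally computed for a document review. The document IDs and the function name `precision_recall` are made up for illustration, and this is not a description of how KMX measures them.

```python
def precision_recall(predicted: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision: share of returned documents that are relevant.
    Recall: share of relevant documents that were returned."""
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review: the system returned four documents; three are truly relevant.
predicted = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d2", "d5"}
p, r = precision_recall(predicted, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # → precision=0.50 recall=0.67
```

In eDiscovery terms, low precision means reviewers wade through non-pertinent documents, while low recall means pertinent documents were missed.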

By embedding KMX in their solution, eDiscovery vendors (or large law firms) can implement capabilities that facilitate technology-assisted review, predictive coding, legal holds and collection. With its gold-plated, machine-learning-based, user-supervised classification, KMX supports functions such as content monitoring, alerting, filtering, performance optimization and concept searching.

Many eDiscovery providers promise ‘predictive coding’ capabilities in their solutions, but often deliver poor results and require a large set of training data. This is a downside of the traditional rule-based NLP approach they use, compared with the modern SVM-based machine learning approach of KMX.

KMX offers a ready-to-deploy advanced analytical and statistical approach that is needed to help users evaluate the increasing volume of electronically stored information.
