AI-Powered Contract Solution for Real Estate

  • Developed advanced NLP and machine vision models to analyze real estate contracts
  • Deployed a scalable cloud-based solution to process more than 100 million pages of contracts per day in real time
  • Analytics investment paid back after only 3 months in operation
Company Overview
The client is a leading provider of real estate technology whose goal is to empower real estate agents with novel insights about properties and people. The company aggregates public and private data about real properties, including sales data, property values, and market conditions.
Problem
The company receives more than 100M pages of real estate contracts and other unstructured documents per day, and these documents need to be analyzed at every stage of the contracting process. Business processes require that:

  • Contracts are verified for completeness. Are signatures or specific contract clauses present?
  • Contracts are filed appropriately and indexed.
  • Contracts can easily be retrieved by real estate clerks using advanced search on embedded metadata (e.g., search by contract parties, property address, contract clauses).

The company used expensive commercial OCR software, but it failed to deliver the needed contract intelligence. A scalable solution with advanced document understanding capabilities was needed to keep up with the growing volume of documents.
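To put the volume in perspective, the short sketch below converts the daily page count into an average per-second rate and a rough worker count. Only the 100 million pages per day comes from the figures above; the per-worker throughput and the headroom factor are hypothetical values chosen purely for illustration.

```python
import math

PAGES_PER_DAY = 100_000_000      # volume stated in this case study
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def average_pages_per_second(pages_per_day=PAGES_PER_DAY):
    """Average arrival rate if pages were spread evenly over the day."""
    return pages_per_day / SECONDS_PER_DAY


def workers_needed(pages_per_second, pages_per_worker_per_second, headroom=1.0):
    """Number of parallel workers required to sustain a given page rate.

    pages_per_worker_per_second and headroom are assumptions, not measured
    values from the engagement.
    """
    return math.ceil(pages_per_second * headroom / pages_per_worker_per_second)


if __name__ == "__main__":
    rate = average_pages_per_second()
    print(f"Average rate: {rate:.0f} pages per second")
    # Hypothetical: each worker handles 5 pages/s, with 50% headroom for peaks.
    print("Workers needed:", workers_needed(rate, 5.0, headroom=1.5))
```

Even as an average, the rate exceeds a thousand pages per second, which is why a horizontally scalable design matters more than the speed of any single machine.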

Solution
We started with a strategic assessment of business processes to understand the client’s requirements for a successful solution. We interviewed business stakeholders and real estate experts to identify key requirements.

Our data scientists used three models to address the contract intelligence challenges:

  • First, we developed a machine vision model using convolutional neural networks (CNNs) to visually scan the documents and locate key areas in contracts. A machine learning classifier then determines whether those areas contain signatures, stamps, barcodes, or tables, among others. Identified areas are extracted and made available for search.
  • Second, we implemented custom language models specifically trained on contract language to correct OCR errors introduced by the commercial software. In this process, we identified cost savings by replacing the commercial OCR software with open-source software. Our implementation achieved better results while saving 100% of the commercial software’s licensing cost.
  • Third, we augmented pre-trained NLP models to focus on real estate contract language to extract entities and relationships between entities of interest.

With these models in place, we were able to extract the full text and real estate specific metadata from every document.
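To illustrate the idea behind the second model, here is a minimal sketch of OCR correction driven by in-domain word frequencies. It is a deliberately simplified stand-in (a unigram frequency model with single-character edits), not the language models built for this engagement. The tiny training corpus and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical in-domain text; a real model would be trained on a large
# corpus of contract language.
CORPUS = """
the buyer agrees to purchase the property from the seller
the seller shall deliver the deed to the buyer at closing
the purchase price shall be paid at closing
"""

WORD_COUNTS = Counter(re.findall(r"[a-z]+", CORPUS.lower()))
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one deletion, replacement, or insertion away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    replaces = [left + c + right[1:] for left, right in splits if right for c in ALPHABET]
    inserts = [left + c + right for left, right in splits for c in ALPHABET]
    return set(deletes + replaces + inserts)


def correct_token(token):
    """Return the most frequent known word within one edit of token.

    Known words, tokens with punctuation, and tokens without a known
    neighbor are returned unchanged. Corrections come back lowercased.
    """
    word = token.lower()
    if word in WORD_COUNTS or not word.isalnum():
        return token
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return token
    # Prefer the most frequent candidate; break ties alphabetically.
    return max(candidates, key=lambda w: (WORD_COUNTS[w], w))


def correct_text(text):
    """Correct each whitespace-separated token independently."""
    return " ".join(correct_token(t) for t in text.split())


if __name__ == "__main__":
    print(correct_text("the buycr agrees to purchasc the propcrty"))
```

Because the vocabulary is learned from contract text, typical OCR confusions such as "c" for "e" or "1" for "l" resolve toward words that actually occur in contracts. A production model would also use surrounding context rather than single-word frequencies.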

Our platform architects designed and implemented a cloud-based environment that scales with the volume of documents. The environment:

  • Dynamically scales up during peak times, and scales back during low usage times to minimize cloud costs.
  • Extracts all data using the ML/AI models and indexes all documents, giving users of the platform the ability to search for any and all entities across all documents.
  • Implements APIs to make all functionality available to an existing document management system.
  • Creates alerts and notifications based on the status and content of contracts. If a contract is missing vital information, an alert is issued to the responsible real estate agent.
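The alerting behavior in the last point can be sketched as a simple completeness check over the elements the models detected in a contract. The required element names, the record layout, and the alert format below are hypothetical and only illustrate the flow; the actual rules depend on the client's business processes.

```python
from dataclasses import dataclass, field

# Hypothetical list of elements a contract must contain to count as complete.
REQUIRED_ELEMENTS = ("buyer_signature", "seller_signature", "property_address")


@dataclass
class ExtractedContract:
    """Metadata produced by the extraction models for one contract."""
    contract_id: str
    agent_email: str
    detected_elements: set = field(default_factory=set)


def missing_elements(contract):
    """Required elements that were not detected, in a stable order."""
    return [e for e in REQUIRED_ELEMENTS if e not in contract.detected_elements]


def build_alert(contract):
    """Return an alert for the responsible agent, or None if complete."""
    missing = missing_elements(contract)
    if not missing:
        return None
    return {
        "to": contract.agent_email,
        "contract_id": contract.contract_id,
        "missing": missing,
    }


if __name__ == "__main__":
    contract = ExtractedContract(
        contract_id="C-001",
        agent_email="agent@example.com",
        detected_elements={"buyer_signature", "property_address"},
    )
    print(build_alert(contract))
```

Keeping the check separate from alert delivery means the same result can drive email notifications, dashboard flags, or API responses to the document management system.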

Impact
The contract intelligence solution saves time and money:

  • The status of contracts is verified in real time, leading to fewer contract delays and higher satisfaction among real estate clients.
  • Advanced search over contract details speeds up real estate agents’ daily workflow.
  • The solution realized savings on licensing and per-page costs by completely eliminating expensive commercial software.
  • The cost savings over commercial products paid for the engagement within 3 months of the solution being deployed in production.

We’d love to hear from you

Contact us for more information or to discuss what problem we may be able to help you solve.