Code Name: Searchy the Search Engine

We built a search engine that snuffs out compliance issues and keeps getting smarter.

Problem

What was the problem to be solved?

The finance industry is heavily regulated, so firms have to stay on top of their employees to make sure they're not violating any rules, regulations or laws.

This has become particularly problematic in the digital age because employees can break these rules more easily than ever, often without even knowing it.

Locking down and monitoring digital infrastructure has long been the domain of the IT department.

However, its tools are often kludgy and require archaic searches for IP addresses and indicators in log files to see whether improper sites were accessed or certain communications were made.

Solution

What was the proposed solution?

We proposed a front-end that would allow compliance directors to monitor employees and retroactively search for violations without having to contact the IT department and send them on a search for a needle in a haystack.

Additionally, we added the ability for administrators to note whether suspicious activity was or was not a compliance issue so Searchy would get better at flagging potential issues.
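
To illustrate how administrator verdicts could feed back into flagging, here is a minimal sketch. It is not Searchy's actual model: the FeedbackScorer class, its method names and the smoothed-rate scoring are all assumptions made for illustration.

```python
from collections import defaultdict


class FeedbackScorer:
    """Toy scorer that learns from reviewer verdicts (hypothetical, not Searchy's model)."""

    def __init__(self):
        # term -> [times seen in confirmed issues, times seen in dismissed alerts]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, terms, is_issue):
        """Store an administrator's verdict for the terms found in one flagged event."""
        for term in set(terms):
            self.counts[term][0 if is_issue else 1] += 1

    def score(self, terms):
        """Return the highest smoothed issue rate across the event's terms."""
        rates = []
        for term in set(terms):
            confirmed, dismissed = self.counts.get(term, (0, 0))
            # Laplace smoothing keeps terms with no verdicts at a neutral 0.5
            rates.append((confirmed + 1) / (confirmed + dismissed + 2))
        return max(rates, default=0.5)


scorer = FeedbackScorer()
scorer.record(["wire", "offshore"], is_issue=True)
scorer.record(["wire", "payroll"], is_issue=False)
print(scorer.score(["offshore"]))  # above 0.5: only seen in a confirmed issue
print(scorer.score(["payroll"]))   # below 0.5: only seen in a dismissed alert
```

Each verdict nudges the score for the terms involved, so events that resemble confirmed issues rise and events that resemble dismissed alerts fall. A production system would more likely train a proper classifier on these labels.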

Challenges

What challenges arose during the project?

The main challenge was identifying the right technology to ingest data from hundreds of thousands of sources, parse it and store it in a searchable format, indefinitely and at scale.

Technical

What was the technical approach to the project?

We used Logstash, an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to a “stash.”

In our case, the stash was Elasticsearch on AWS.

The raw data was sent through Amazon Kinesis, where it was processed, parsed and indexed in Elasticsearch, and also fed into a machine learning notebook within Amazon SageMaker.
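
The parse-and-index step can be sketched roughly as follows. The log line format, the field names and the "searchy-events" index name are assumptions for illustration, not the project's real schema. The only real interface relied on is the newline-delimited action/document layout of the Elasticsearch bulk API.

```python
import json
from datetime import datetime


def parse_event(raw_line):
    """Turn one raw log line into a structured document ready for indexing.

    Assumed (hypothetical) format:
    "<ISO timestamp> <user> <source ip> <destination> <free text>"
    """
    parts = raw_line.strip().split(" ", 4)
    if len(parts) < 4:
        return None  # malformed line; a real pipeline would set it aside for review
    timestamp, user, source_ip, destination = parts[:4]
    try:
        parsed_time = datetime.fromisoformat(timestamp)
    except ValueError:
        return None  # unparseable timestamp
    return {
        "@timestamp": parsed_time.isoformat(),
        "user": user,
        "source_ip": source_ip,
        "destination": destination,
        "message": parts[4] if len(parts) == 5 else "",
    }


def to_bulk_body(documents, index_name):
    """Serialize documents as a bulk request body: one action line, then one source line."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


doc = parse_event("2019-03-01T14:05:00+00:00 jdoe 10.0.0.12 example.com GET /upload")
print(to_bulk_body([doc], "searchy-events"))
```

Rejecting malformed lines at parse time keeps a single bad record from failing a whole batch.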

We built the front-end in React, which allowed compliance directors to search the massive database in a very Google-like fashion.
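
A Google-like search box ultimately has to become an Elasticsearch query. The sketch below shows one plausible way to build the request body with the standard query DSL (bool, multi_match, range). The build_search_body function and the field names "message", "destination", "user" and "@timestamp" are hypothetical, not taken from the project.

```python
def build_search_body(text, since=None, page=0, page_size=20):
    """Build a request body for a free-text search across the assumed event fields."""
    query = {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": text,
                        "fields": ["message", "destination", "user"],
                    }
                }
            ]
        }
    }
    if since:
        # Filters narrow the result set without affecting relevance scoring
        query["bool"]["filter"] = [{"range": {"@timestamp": {"gte": since}}}]
    return {"query": query, "from": page * page_size, "size": page_size}


body = build_search_body("offshore wire transfer", since="2019-01-01", page=1)
print(body["from"], body["size"])  # 20 20
```

Leaving out an explicit sort keeps Elasticsearch's default relevance ordering, which is what gives the search-engine feel.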

Management

What was the project management approach to the project?

We used Agile Scrum with one-week iterations and weekly deployments prior to each iteration planning meeting (IPM).

We needed only one research sprint, at the beginning of the project, to identify all the data sources and work out how we would integrate with them.

From there, we proceeded with development sprints until the launch of the project.

Lessons

What did you learn from working on this project?

We learned just how much data flows through a large financial institution and how to create a scalable pipeline to tag that data and make it searchable.

Benefits

How did this project benefit the client?

After the completion of the project, compliance directors were able to do a large portion of the monitoring without involving company IT.

This included setting up proactive monitoring for words, names and destinations that would escalate attention to particular areas of the company.
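
Proactive monitoring of this kind can be pictured as matching each incoming event against a watchlist. The sketch below is an assumption-laden illustration: the function names, the event fields and the whole-word regular expression approach are ours, not a description of how Searchy implemented it.

```python
import re


def compile_watchlist(terms):
    """Build one case-insensitive pattern that matches any watched term as a whole word."""
    if not terms:
        raise ValueError("watchlist must contain at least one term")
    # Longest terms first so multi-word names win over their shorter prefixes
    escaped = sorted((re.escape(term) for term in terms), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


def find_hits(pattern, event):
    """Return the distinct watched terms (lowercased) found in an event's text fields."""
    text = " ".join(str(event.get(field, "")) for field in ("user", "destination", "message"))
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})


watchlist = compile_watchlist(["insider", "dropbox.com", "Jane Roe"])
event = {
    "user": "jdoe",
    "destination": "dropbox.com",
    "message": "sent insider notes to Jane Roe",
}
print(find_hits(watchlist, event))  # ['dropbox.com', 'insider', 'jane roe']
```

Any event with a non-empty hit list would be a candidate for escalation to a compliance director.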

Directors would also have a historical record. If it came out that a certain site was problematic, directors could see who had accessed that site going back to the inception of the project.
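
A historical lookup like this maps naturally onto an Elasticsearch aggregation. The sketch below builds such a request body with the standard term filter and terms/min aggregations. It assumes hypothetical "destination" and "user" fields indexed as exact-match keywords and an "@timestamp" date field; none of these names come from the project.

```python
def build_site_history_body(destination, since=None, max_users=100):
    """Build a request body listing which users accessed a site and when they first did."""
    filters = [{"term": {"destination": destination}}]
    if since:
        filters.append({"range": {"@timestamp": {"gte": since}}})
    return {
        "size": 0,  # only the aggregation is needed, not the individual events
        "query": {"bool": {"filter": filters}},
        "aggs": {
            "users": {
                "terms": {"field": "user", "size": max_users},
                "aggs": {"first_seen": {"min": {"field": "@timestamp"}}},
            }
        },
    }


body = build_site_history_body("example.com")
print(body["aggs"]["users"]["terms"])  # {'field': 'user', 'size': 100}
```

Omitting the since argument searches the full retained history, which matches the "back to the inception of the project" use case.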

Why Gunner?

Why was Gunner selected for this project?

Gunner Technology heard about the compliance problem and proposed this solution.

We built a risk-free prototype to demonstrate how it would work, and it was immediately well received.

Proficiencies

What tools, techniques and methodologies were used on this project?

Agile

Agile software development refers to a group of software development methodologies based on iterative development, where requirements and solutions evolve through collaboration between self-organizing cross-functional teams.

Amazon API Gateway

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.

Amazon Athena

Start querying data instantly. Get results in seconds. Pay only for the queries you run.

Amazon CloudFront

Amazon CloudFront is a content delivery network offered by Amazon Web Services.

Amazon Elasticsearch Service

Fully managed, reliable, and scalable Elasticsearch service.

Amazon EMR

Distribute your data and processing across Amazon EC2 instances using Hadoop.

Amazon Kinesis

Store and process terabytes of data each hour from hundreds of thousands of sources

Amazon Redshift

Fast, simple, cost-effective data warehouse that can extend queries to your data lake

Amazon SageMaker

Build, train, and deploy machine learning models at scale

AWS Lambda

AWS Lambda lets you run code without provisioning or managing servers.

Elasticsearch

Elasticsearch is an Open Source, Distributed, RESTful Search Engine

Git

Fast, scalable, distributed revision control system

Node.js

Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications

React

React is a JavaScript library for building user interfaces.

Scrum

Scrum is a framework for project management that emphasizes teamwork, accountability and iterative progress toward a well-defined goal.

Serverless Framework

Build web, mobile and IoT applications with serverless architectures using AWS Lambda, Azure Functions, Google Cloud Functions & more.