Interview with William Bushee
BrightPlanet is one of the first commercial companies, if not the first, to harvest content from the Deep Web. Like other cyber OSINT vendors, BrightPlanet focuses on serving governmental entities, law enforcement, security professionals in the financial sector, and related business sectors.
The founder of the company (Michael Bergman) is credited with coining the phrases Deep Web and invisible Web. With the rise of the Dark Web (online information accessed via the TOR network and person-to-person authorizations), BrightPlanet has begun to refine its technology to make this information more accessible to law enforcement, security, and intelligence professionals.
In the 1990s, as information became available on the public Internet, BrightPlanet began to develop systems and methods to provide access to high-value information. When paywalls and registration were put in place to ensure that certain content such as databases would be available only to an authorized user, BrightPlanet undertook the onerous task of writing adaptors to access this content and perform additional operations to give the information context.
Since the company’s inception in 2001, the firm has been at the forefront of providing its clients with access to information not available in a standard search of Google.com, Facebook.com, Topsy.com, or any other popular Web indexing service.
At the seminar, BrightPlanet stressed that the phrase “Deep Web” is catchy but does not explain what type of information is available to a person with a Web browser. Some webmasters require a user to enter parameters for a query. Those parameters are passed to the query processing system, the matching information is pulled from a database, and the results are rendered in a browser. A familiar example is querying a dynamic database, such as an airline’s flight schedule. Other types of “Deep Web” content may require the user to register. Once logged into the system, users can query the content available to a registered user. A service like Bitpipe requires registration and a user name and password each time I want to pull a white paper from the Bitpipe system. BrightPlanet can handle both types of indexing tasks and many more. BrightPlanet’s technology is used by governmental agencies, businesses, and service firms to gather information pertinent to people, places, events, and other topics.
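To make the parameterized-query case concrete, here is a minimal Python sketch of the round trip described above: parameters are encoded into a request, a query processor reads them, and matching rows come back from a database. It is an illustration only, not BrightPlanet's method. The endpoint `https://flights.example.com/search`, the `FLIGHTS` records, and the helper names `build_query_url` and `run_query` are all hypothetical, and the "server" is simulated in memory so no network access is needed.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical in-memory "database" standing in for an airline schedule system.
FLIGHTS = [
    {"origin": "SFO", "dest": "IAD", "date": "2015-02-19", "flight": "XX100"},
    {"origin": "SFO", "dest": "JFK", "date": "2015-02-19", "flight": "XX200"},
    {"origin": "ORD", "dest": "IAD", "date": "2015-02-20", "flight": "XX300"},
]


def build_query_url(base_url, params):
    """Encode user-supplied parameters into a query URL, as a search form would."""
    return f"{base_url}?{urlencode(params)}"


def run_query(url):
    """Simulate the query processor: parse the parameters and pull matching rows."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list of values per key; keep the first value of each.
    wanted = {key: values[0] for key, values in query.items()}
    return [
        row for row in FLIGHTS
        if all(row.get(key) == value for key, value in wanted.items())
    ]


# The page a user sees exists only after the parameters are supplied.
url = build_query_url(
    "https://flights.example.com/search",  # hypothetical endpoint
    {"origin": "SFO", "dest": "IAD"},
)
results = run_query(url)
```

The point of the sketch is that the result set has no fixed address a link-following crawler could discover; it is produced only in response to the submitted parameters, which is why this kind of content sits outside a standard Web index.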
I spoke with William Bushee, as part of the research for the CyberOSINT Seminar, held in Washington, DC in February 2015.
What’s your background?
My background is engineering. When we first launched BrightPlanet in 2001, I developed our initial harvest engine. At the time, little work was being done around harvesting. We filed a number of US patent applications for our unique systems and methods.
Do you have patents for your technologies?
Yes, we were awarded eight, primarily around the ability to conduct Deep Web harvesting, a term BrightPlanet coined.
What’s your present role at BrightPlanet?
Over the years I’ve moved from a straight engineering role to management to operations, which is where I spend the bulk of my time today. Occasionally, I do jump in and write a little code, but those days are pretty rare. Plus, my engineers prefer it when I don’t get that involved.
Where did you get the idea for your cyber OSINT service?
The idea emerged from conversations among our team. We did our homework and realized that we had to create our own tools, techniques, and information methods to deal with content accessible via the Internet and eventually any network supporting certain communication protocols.
How do you describe your company to a potential customer or partner?
Marketing is very challenging. As I think about our approach to explaining what we do, and its value to our clients, we continue to hit certain main benefits our technology provides.
Would you give me a flavor of what you say to a potential customer?
At BrightPlanet we believe that you can do anything with data. Not just any data, but the right data in the right way.
We provide enriched Web content to our clients for analytics and intelligence. How we do that, and what is included, depends on the specific project.
Some projects require a small amount of harvesting while others may require millions of documents per month collected from both the surface Web and the Deep Web. We harvest data to help our customers do anything.
Events like the Sony hack always increase inquiries; however, many of those inquiries are about protecting internal content, which doesn’t require Web collection capabilities.
So you combine your technology with professional services for your clients?
Not exactly. But you are close. We pride ourselves on working with our clients as members of their team. We adapt to the requirements of each engagement. We provide what is needed for the client to achieve his or her objectives, and in most cases we use our systems and methods as a foundation and then tailor our work to fit into the specifics of each job.
How have your products and services changed since they were introduced?
As you know, we have been providing Deep Web and Dark Web services to our clients for many years. We are coming up on our 15th anniversary.
Our services have evolved continuously since our first solution.
What are some of the obvious changes you have made?
There are the obvious changes, like migrating from an on-site license model to a SaaS model. However, the biggest change came after realizing we could not put our customers in charge of conducting their own harvests. We thought we could build the tools and train the customers, but it just didn’t work well at all. We now harvest content on our customers’ behalf for virtually all projects and it has made a huge difference in data quality. And, as I mentioned, we provide supporting engineering and technical services to our clients as required. Underneath, however, we are the same sharply focused, customer-centric technology operation.
What are the benefits you highlight for your customers?
That’s a good question. The key benefits for our customers are being able to achieve their goals, increase revenue, increase customer share, and save time and money.
We’ve seen many of our customers use our Data-as-a-Service model to increase revenue and customer share by adding new datasets to their current product and service offerings. These additional datasets develop new revenue streams for our customers and allow them to stay competitive, retaining existing customers and gaining new ones.
Our Data-as-a-Service offering saves time and money because our customers no longer have to invest development hours into maintaining data harvesting and collection projects internally. Instead, they can access our harvesting technology completely as a service.
Why are cyber-centric and content processing services important?
As your seminar showed, there is a growing awareness of the importance of automated collection and analysis. I think interest is going to continue to grow as a response to more awareness about the value of digital information availability.
How do you manage customer expectations for your products and services?
One of the biggest expectations we must overcome comes from customers who hear about the Deep Web and the Dark Web from mass media. The customer sometimes assumes we have some magical index with all the secret information already collected. Those expectations come with a strong reality check. Once those customers learn the intricacies of the Deep Web, they quickly understand that the media doesn’t always get the fine details right.
How can a financial institution like Bank of America or Prudential Insurance make use of your products and services?
Financial institutions are a new market we are branching into, mostly because of their immediate need to provide better intelligence to meet regulations. Our services fit well because large banks are realizing they need to better monitor cyber intelligence, not just react after a breach or issue occurs.
The insurance industry is also adopting social media data in a big way. For the past two years, we have been providing various services to the insurance sector. Our role is very much a behind-the-scenes harvest solution and we keep a lot of our insurance work proprietary.
What is the typical setup time for your system before a customer can make use of the technology? Are there any ways that a customer can prepare to move to this technology to make it easier?
We offer a few services which can be set up in just a few minutes: BlueJay, Deep Web Monitor and our data feeds. These are low-cost tools which appeal to specific markets. None of the tools require more than a 15-minute demo or YouTube tutorial.
Our larger, business-centric services (Outpost, AuthentiWeb and our Deep Web Harvester) take days to weeks to set up and start providing value. These are large-ticket services customized to a client’s needs and typically not shared across multiple customers. If a customer has their needs well documented, we can have a solution up and running within a few days, but typically they do not know the sources or the lists necessary to get going on day one.
Customers who know which websites they want to harvest and what type of analytics they will be using can begin quickly. Our Data Acquisition Engineer team has a well-organized project management plan we use for new clients. We have seen customers up and running within a few days if they come prepared with the problems they want to solve.
When estimating the cost of your products and services, what is a ballpark figure for year one?
Our services range in price from $300/month for a BlueJay subscription to hundreds of thousands of dollars per year for a custom Outpost or AuthentiWeb solution. Costs will vary based on the level of effort to build and maintain the harvests and data integrations. We try to keep costs low the first year because we know follow-up years will require less effort to maintain.
Average cost starts around $75,000 for year one and we typically maintain that same price for the first three years of a project.
What are the key differentiators for your firm’s cyber intelligence products or services?
Our focus is on customer service and our patented systems and methods used to harvest OSINT content.
What are the areas of research which your firm is considering for your next generation solution?
The biggest area of research right now is expanding our offerings through integrations with our partners’ platforms. We know that we cannot provide the best analytic or visualization solution for every customer, so we have been working closely with other technology companies to offer a complete end-to-end solution with joint offerings.
Another area we have expanded is our support for Dark Web content, which is a niche or specialist market. We do see some clients who have a limited need for Dark Web data, and we want to be able to offer that as a service.
What are the trends having a direct impact on your firm’s product or service?
That is a difficult question. I would say that there are two big trends that we see from our customers and partners: collection of more “Dark Web” data and the increased use of Web data for analytics.
The “Dark Web” versus “Deep Web” discussion is one that we have regularly. BrightPlanet generally defines the “Dark Web” as purposely obfuscated content, but we have a much longer post on our blog that gets into all the technical details. Our Deep Web harvester can pull content from some Dark Web sources, including TOR, although we try to avoid diving deep into that part of the Internet. The recent coverage of the Dark Web on TV shows, such as House of Cards, boosts inquiries. More exposure of what is taking place online, Deep or Dark, is only a positive for the entire industry.
And the second trend?
The second trend, and the one with greater impact on our offerings, is the increased utilization of Web data for analytics. BrightPlanet brands its content as Data-as-a-Service, a spin on Software-as-a-Service: instead of clients collecting and tagging Web data themselves, BrightPlanet provides those services and then pushes the data directly to the client through an API. This is going to be the trend that propels our content to more clients and partners.
How do you think automated threat detection and collection will evolve?
Quickly. Beyond that, it is anyone’s guess. This is an emerging market and there is a lot of innovation right now with our partners, especially around social networking and analytics.
What industry is seeing the most rapid adoption of this type of technology?
Two industries rapidly adopting this technology are online fraud and security/risk management. In both cases, we are only one part of a larger solution where the ability to quickly locate, harvest, enrich and supply content through a Data-as-a-Service model works directly with other technologies to provide end-to-end solutions.
How do you market these services?
Many of those solutions are actually sold through our partners under their brand rather than ours.
How does an interested party contact your firm?
I’m happy to jump on the phone with anyone to talk technology; interested parties can set up a meeting request with me through my page at our website. My VP of Business Development, Tyson Johnson, has worked in this field as a practitioner for years; he is also a great resource for those who need advice on leveraging cyber data for intelligence projects. Anyone can request a meeting with him through his page at our website as well.
ArnoldIT Cyber OSINT Comment
BrightPlanet is an excellent resource for specialized content services. In addition to providing a client-defined collection of information, the firm can provide custom-tailored solutions to special content needs involving the Deep Web and Dark Web. The company has an excellent reputation among law enforcement, intelligence, and security professionals. The BrightPlanet technologies can generate a stream of real-time content to individuals, work groups, or other automated systems. If you are exploring custom content to support an investigation or sensitive security challenge, BrightPlanet warrants a close look.
Stephen E Arnold, March 26, 2015