Search Basics

This section contains the most important concepts related to the Netlas Search tools. We recommend that you read the information below carefully if you want to take full advantage of the Netlas Search engine.

Documents

Netlas stores its data in the form of structured documents. Here is a list of document types for each data collection:

Responses search: each document represents a scan result (server response) and has a unique URI, e.g. https://app.netlas.io:443/host/
DNS search: each document represents a domain, e.g. app.netlas.io
IP WHOIS search: each document represents an IP range, e.g. 135.125.236.0 - 135.125.237.255
Domain WHOIS search: each document represents a domain, e.g. netlas.io
Certificate search: each document represents a certificate, e.g. certificate with MD5 fingerprint 7af6b9b6ac262c3d3dd82d7a44032ba4

Documents consist of fields and values. For example, each response document has ip, port, and protocol fields.

Search Query Language

Netlas Search tools (except IP/Domain Info) use a simple query language based on the Apache Lucene query syntax. You need to understand how to construct search queries in order to use Netlas effectively.

An elementary search query consists of a document field (where to search) and a search phrase (what to search), separated by a colon.

Let's look at the simplest search as an example:

ip:1.1.1.1

This search query means "Select the documents that have the value 1.1.1.1 in the ip field". This is why fields are sometimes also called filters.

Some fields contain subfields. They are specified as field.subfield. For example, the geo field, which is available in the Responses search, contains the country subfield.

geo.country:US
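
To make the field.subfield notation concrete, the sketch below models a response document as a nested Python dictionary and resolves a dotted path against it. The document contents and the get_field helper are illustrative assumptions, not the actual Netlas schema or API.

```python
from functools import reduce

# A hypothetical, heavily simplified response document (values are made up).
doc = {
    "ip": "1.1.1.1",
    "port": 443,
    "protocol": "https",
    "geo": {"country": "US"},
}

def get_field(document, path):
    """Resolve a dotted path such as 'geo.country' in a nested dict."""
    return reduce(lambda node, key: node[key], path.split("."), document)

print(get_field(doc, "geo.country"))  # US
print(get_field(doc, "port"))         # 443
```

In this picture, the query geo.country:US selects the documents for which the dotted path resolves to US.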

If you want to search for a phrase, use quotes to combine words:

http.title:"Mail server"
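
If you build queries in a script, the quoting rule is easy to automate. The following is a minimal sketch; the term helper is a hypothetical name, and it only produces the query text without sending anything to Netlas.

```python
def term(field, value):
    """Build a `field:value` term, quoting the value if it contains whitespace."""
    text = str(value)
    if any(ch.isspace() for ch in text):
        # Escape backslashes and double quotes before wrapping in quotes.
        text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return f"{field}:{text}"

print(term("ip", "1.1.1.1"))              # ip:1.1.1.1
print(term("http.title", "Mail server"))  # http.title:"Mail server"
```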

If the search field isn't specified, the search will be performed in the default fields. Each search tool uses its own set of default fields.

Try the following search query in the Responses Search tool:

"index of"

Use an asterisk ("*") operator to search for any value. For example, to search for web pages with any title:

http.title:*

You can also use an asterisk ("*") to address any field or subfield, but in this case you have to escape it with a backslash ("\"). The following query returns responses of any protocol with the word "email" in the banner:

\*.banner:email
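
When a query like this is written in program code, the backslash itself usually needs protecting from the host language. A minimal Python sketch follows; the any_protocol helper is a hypothetical name used only for illustration.

```python
def any_protocol(subfield, value):
    """Build a query that addresses `subfield` under any top-level field."""
    # A raw string keeps the backslash literal; "\\*." would be equivalent.
    return r"\*." + f"{subfield}:{value}"

query = any_protocol("banner", "email")
print(query)  # \*.banner:email
```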

If you query an asterisk ("*") without specifying any field, Netlas returns the most recently collected documents from the selected data collection.

Mapping

All searchable fields are listed in the mapping. Each data collection has its own mapping. Comments on the purpose of the fields and examples of search queries are given in the following sections of this manual.

Click on any field to add it to the search string.

Complex Queries

Use AND, OR, and NOT operators (which can also be written as &&, ||, and !) to build complex search queries:

prot7:ssh AND NOT port:22

The default operator is AND. It is used when you put two or more search terms separated by spaces and do not specify any operator between them. The following two queries return the same results:

prot7:imap port:993

prot7:imap && port:993

Use brackets to combine search terms:

port:(8080 OR 8088 OR 8888) protocol:http

(ssh:* geo.country:AU) !port:22

Note that the ! operator is not separated from the term that follows it by a space.
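
The operators above compose mechanically, so complex queries can be assembled from small pieces. The sketch below uses three hypothetical helpers (any_of, all_of, negate) that only build query strings; they are not part of any Netlas library.

```python
def any_of(field, values):
    """field:(a OR b OR c), matching any of the listed values."""
    return f"{field}:(" + " OR ".join(str(v) for v in values) + ")"

def all_of(*terms):
    """Join terms with spaces, relying on the default AND operator."""
    return " ".join(terms)

def negate(term):
    """Prefix a term with '!' (no space between the operator and the term)."""
    return "!" + term

print(all_of(any_of("port", [8080, 8088, 8888]), "protocol:http"))
# port:(8080 OR 8088 OR 8888) protocol:http

print(all_of("(ssh:* geo.country:AU)", negate("port:22")))
# (ssh:* geo.country:AU) !port:22
```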

Range Search

Specify ranges for date and numeric fields using the [ TO ] syntax or <, >, = symbols for one-sided ranges.

host:1.1.1.1 port:<=1000

ip:[173.194.222.0 TO 173.194.222.255]

Fields of type 'IP address' additionally support searching by subnet in CIDR notation:

ip:"173.194.222.0/24"

Escape the slash if you don't use the quotes.
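
The range and subnet forms shown above cover the same span of addresses. The sketch below uses Python's standard ipaddress module to turn a CIDR subnet into the corresponding [ TO ] range and to produce both the quoted and the escaped subnet form. The helper names are hypothetical.

```python
import ipaddress

def cidr_to_range_query(field, cidr):
    """Rewrite a CIDR subnet as a [first TO last] range query."""
    net = ipaddress.ip_network(cidr)
    return f"{field}:[{net.network_address} TO {net.broadcast_address}]"

def cidr_query(field, cidr, quoted=True):
    """Build a subnet query, either quoted or with the slash escaped."""
    if quoted:
        return f'{field}:"{cidr}"'
    return f"{field}:" + cidr.replace("/", "\\/")

print(cidr_to_range_query("ip", "173.194.222.0/24"))
# ip:[173.194.222.0 TO 173.194.222.255]
print(cidr_query("ip", "173.194.222.0/24"))                # ip:"173.194.222.0/24"
print(cidr_query("ip", "173.194.222.0/24", quoted=False))  # ip:173.194.222.0\/24
```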

Full-Text Search

Netlas stores text values using two data types: 'TEXT' and 'KEYWORD'.

The vast majority of fields are of the 'TEXT' type. This data type is designed to perform full-text search through large amounts of text data, where the search returns all relevant results rather than just exact matches.

Fields of type 'TEXT' store data in tokenized and normalized form. You can read about tokenization here. Essentially, it means that all punctuation, special, and service characters are ignored during the search. The search also ignores letter case and even some word forms, e.g. there is no difference between plural and singular.

Let's look at a couple of examples to understand the pros and cons of tokenization:

http.title:"index of"

Documents with the titles index of, Index of, Index of:, INDEX OF, INDEX of, and even index /of will be found.
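
To see why all of these titles match, here is a deliberately rough imitation of tokenization and normalization: lowercase the text and split it on anything that is not a letter or digit. This is an assumption made for illustration, not the analyzer Netlas actually uses, and it does not model word forms such as plural and singular.

```python
import re

def tokenize(text):
    """Rough stand-in for a text analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_matches(phrase, text):
    """True if the phrase tokens occur consecutively among the text tokens."""
    needle, haystack = tokenize(phrase), tokenize(text)
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))

titles = ["index of", "Index of:", "INDEX OF", "index /of", "Indexed offers"]
for title in titles:
    print(f"{title!r}: {phrase_matches('index of', title)}")
```

The first four titles reduce to the same two tokens, index and of, so they match; the last one produces different tokens and does not.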

Another example:

http.body:(stephen hawking)

Among other results, the search will also return matches containing Stephen William HAWKING and documents with links like http://some-blog.com/label/Stephen%20Hawking/.

Exact Match Search

For fields where it makes sense to search only by exact match, the 'KEYWORD' type is used. Mostly these are domain names and email addresses. Fields of the 'KEYWORD' type don't use tokenization or normalization.

Try the following query in the DNS Search:

domain:Netlas.io

Most likely you will get nothing, because there is no exact match for Netlas.io due to the capital first letter. But the following query should work:

domain:netlas.io

It is important to understand that the 'KEYWORD' type is necessary in some cases. Let's imagine that we used the 'TEXT' type for domain names. Then the above search would also return app.netlas.io and even netlas.io-io-ho.com.
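
The difference between the two types can be sketched in a few lines. The keyword_match function below compares whole values, while text_match uses a simplified tokenized comparison (lowercase, split on non-alphanumerics). Both functions are hypothetical illustrations of the idea, not the real search implementation.

```python
import re

def _tokens(text):
    """Simplified analyzer: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_match(query, value):
    """KEYWORD-style comparison: the whole value must match exactly."""
    return query == value

def text_match(query, value):
    """TEXT-style comparison (simplified): query tokens appear consecutively."""
    q, v = _tokens(query), _tokens(value)
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))

domains = ["netlas.io", "app.netlas.io", "netlas.io-io-ho.com"]

print([d for d in domains if keyword_match("netlas.io", d)])
# ['netlas.io']
print([d for d in domains if text_match("netlas.io", d)])
# ['netlas.io', 'app.netlas.io', 'netlas.io-io-ho.com']
print(keyword_match("Netlas.io", "netlas.io"))  # False
```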

There is a little trick if you want to search for an exact match in a 'TEXT' type field.

Try to address it like field.keyword:

http.title.keyword:"cPanel Login"

Most 'TEXT' fields have 'KEYWORD' subfields for exact match searching. This trick doesn't work with fields whose values are longer than 256 characters (like http.body), because it would be too resource-intensive.

Fuzzy Search

Use fuzzy queries with the ~ operator to search for similarly spelled domains or different forms of words. You can specify the distance with a number after the fuzzy operator. Supported distances are 1 and 2; the default is 2.

domain:netlas.io~

domain:google.com~1
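
The distance here is an edit distance: the number of single-character changes needed to turn one string into another. The sketch below implements the classic Levenshtein distance to give a feel for what ~1 and ~2 allow. It is an approximation: Lucene's fuzzy matching is based on the Damerau-Levenshtein distance, which also counts a swap of two adjacent characters as one edit, and this sketch does not. The candidate names are made up.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

candidates = ["google.com", "gogle.com", "goggle.com", "googles.com", "gooogle.co"]
for name in candidates:
    print(name, edit_distance("google.com", name))
```

Names at distance 1 would be candidates for domain:google.com~1, while names at distance 2 would only be reachable with the default distance.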

Wildcards and Regex

Finally, you can use wildcards ("*" and "?") and regular expressions.

The following query in the DNS Search returns google level-2 domains:

domain:google.* level:2

Regular expression patterns can be embedded in the query string by wrapping them in forward-slashes ("/").

The same query with regex:

domain:/google\..*/ level:2

Lucene’s regular expression engine does not use the Perl Compatible Regular Expressions (PCRE) library, but it does support most of the standard operators.
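
The two forms above can be compared locally on a handful of sample values. The sketch below uses Python's fnmatch for the wildcard and re.fullmatch for the regex; full matching is used on the assumption that Lucene regular expressions match the whole value rather than a substring. Python's pattern syntax is not identical to Lucene's, and the sample domains are made up, so treat this as an approximation.

```python
import re
from fnmatch import fnmatchcase

domains = ["google.com", "google.co.uk", "mygoogle.com", "google"]

# Wildcard pattern: '*' matches any run of characters, '?' a single character.
wildcard_hits = [d for d in domains if fnmatchcase(d, "google.*")]

# Regex pattern, matched against the whole value.
regex_hits = [d for d in domains if re.fullmatch(r"google\..*", d)]

print(wildcard_hits)  # ['google.com', 'google.co.uk']
print(regex_hits)     # ['google.com', 'google.co.uk']
```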

Search Tips

It may seem a little confusing at first to write search queries. Try learning from examples:

Find some examples right in the app under the search bar.
More examples are available in the Netlas Dorks repository.
Netlas and feeds are also good sources of relevant query examples.

Refining the Query

While searching, use the right panel to refine your query. For example, filter results by geolocation or specific network.

The sidebar can be hidden by clicking the Show/Hide sidebar button on the toolbar.

Please note that you can switch between mapping and summary using the tabs at the top of the sidebar.

Favorites

If you plan to repeat a search later, once new data has been collected, you can save the search query to your favorites. To do this, click the Add to favorites button on the right side of the search bar.

You can label and group your favorites.

View Preferences

Some search tools support view configuration. Click the View configuration button located on the toolbar to change the settings.

Historical Search

Netlas collects data in cycles. For example, resolving all domains from the first to the last is a cycle. Each cycle is written to a separate index. By default, the search is performed in the latest index. Click the Index selection button on the right side of the search bar to change the search index.

For most data collections, the default index is stored on fast-access storage (we use large-capacity SSDs). Changing the index can significantly affect the search execution time. Therefore, pay attention to the labels:

fast – The index is stored in a high access speed memory.

slow – The index was moved into memory with a slow access speed.

You may have access to indexes that are currently being updated:

indexing – The index is not yet fully built.

full – The index is fully built.

With facet analysis, you can divide your search results into groups. Any document field can be a grouping criterion.

For example, you can group all responses in the index by the port field. This will allow you to find out exactly which ports Netlas scans. To do this, query an asterisk ("*"), go to facet analysis by clicking the Group search results button on the toolbar, and select the port field as the grouping criterion.

Facet analysis is quite resource-intensive, so we limit the number of groups in the search results to one thousand.
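
Conceptually, facet analysis is a count of documents per field value. The sketch below reproduces the idea locally with collections.Counter on a few made-up documents. The facet helper and the sample data are assumptions for illustration; the real grouping is done by Netlas on the server side.

```python
from collections import Counter

MAX_GROUPS = 1000  # mirrors the one-thousand-group limit mentioned above

# Hypothetical search results; real documents carry many more fields.
responses = [
    {"ip": "192.0.2.1", "port": 443},
    {"ip": "192.0.2.2", "port": 80},
    {"ip": "192.0.2.3", "port": 443},
    {"ip": "192.0.2.4", "port": 22},
    {"ip": "192.0.2.5", "port": 443},
]

def facet(documents, field, limit=MAX_GROUPS):
    """Count documents per value of `field`, largest groups first."""
    counts = Counter(doc[field] for doc in documents if field in doc)
    return counts.most_common(limit)

print(facet(responses, "port"))  # [(443, 3), (80, 1), (22, 1)]
```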

Once the search query is built and the search results match what you wanted, you can share the search or download the results.

Find the Download results button on the toolbar to download search results in JSON or CSV format. It is recommended to select only the fields that you need in the download options. This way you will get results faster.

How to download large sets of search results that exceed download limits? Read the FAQ →

By clicking on the Share button, you can generate a short link and QR-code for the current search query.