Trackers

Introduction

Rather than subscribing to individual feeds, you may want to subscribe to every entry that matches a certain query, across all feeds. The most common use case is subscribing to any entry that matches a given keyword.

Only “Tracker” users can use track feeds. However, the API calls for subscribing, unsubscribing, listing, or retrieving past entries are the same ones that “Subscriber” users use for regular feeds, through either the PubSubHubbub API or the XMPP API.

Building Track Feeds

Track feeds are virtual feeds in the sense that they’re generated on the fly, as long as their URL matches the following criteria:

  • Can be http or https
  • Uses the track.superfeedr.com hostname
  • Has a query string parameter named query, whose value is the query itself.

Here’s an example of a track feed: http://track.superfeedr.com/?query=superfeedr. You can also add a format query string parameter, with atom or json as its value. Refer to our schema section for details on both Atom and JSON.
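The criteria above can be sketched as a small URL builder. This is a minimal illustration, not an official client: build_track_url is a hypothetical helper name, and it assumes the service decodes standard form encoding (spaces sent as +, a literal + sent as %2B).

```python
from urllib.parse import urlencode

TRACK_HOST = "track.superfeedr.com"  # hostname required for track feeds


def build_track_url(query, fmt=None, scheme="http"):
    """Return a track feed URL for `query` (hypothetical helper)."""
    if scheme not in ("http", "https"):
        raise ValueError("scheme must be http or https")
    params = {"query": query}
    if fmt is not None:
        if fmt not in ("atom", "json"):
            raise ValueError("format must be atom or json")
        params["format"] = fmt
    # urlencode percent-encodes quotes, colons and other reserved characters.
    return f"{scheme}://{TRACK_HOST}/?{urlencode(params)}"


print(build_track_url("superfeedr"))
print(build_track_url('"new york" -site:techmeme.com', fmt="json", scheme="https"))
```

Encoding the query matters because characters such as +, | and " carry meaning in both URLs and queries.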

Queries

Queries are the equivalent of search queries. They contain several members separated by spaces. Each member is either a keyword or a flag with an associated value. Spaces are interpreted as AND.

Some special characters are also supported:

  • + signifies an AND operation
  • - negates the member that follows it
  • | signifies an OR operation
  • " (quote) wraps a number of tokens to form a phrase to search for
  • ( and ) indicate precedence

Here are valid examples of queries:

Query                   Details
superfeedr              will match any mention of superfeedr.
laurel hardy            will match any mention of laurel AND hardy in the same document. The two words can be apart.
romeo -juliette         will match any mention of romeo that does not have a mention of juliette.
paris (texas | france)  will match any mention of paris along with texas or france.
"new york"              will match any mention of new york (with new and york being consecutive).
"AAPL"                  will match any mention of AAPL but not aapl.
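To make the operators concrete, here is a minimal sketch of helpers that compose query strings. The function names (phrase, negate, any_of, all_of) are hypothetical and only do string formatting; they do not validate the query.

```python
def phrase(text):
    """Wrap text in double quotes so its tokens are matched as a phrase."""
    return f'"{text}"'


def negate(member):
    """Prefix a member with - to exclude documents that mention it."""
    return f"-{member}"


def any_of(*members):
    """Group members with | (OR) inside parentheses for precedence."""
    return "(" + " | ".join(members) + ")"


def all_of(*members):
    """Join members with spaces, which are interpreted as AND."""
    return " ".join(members)


print(all_of("paris", any_of("texas", "france")))
print(all_of("romeo", negate("juliette")))
print(phrase("new york"))
```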

Site

The site: flag restricts matches to content published on a given site. The value is a hostname (a full domain or a subdomain) and is matched against the host from which the content was published.

You can add at most one site: per query.

Examples:

Query                                  Details
pubsubhubbub site:blog.superfeedr.com  will match any mention of pubsubhubbub published on our blog.
superfeedr site:techmeme.com           will match any mention of superfeedr published on Techmeme.

You can negate this flag by using -site: instead. Unlike site:, you can use several -site: flags in the same query.

This is useful when refining tracking feeds for which a lot of content is coming from the same sources.

Query                                          Details
apple -site:techmeme.com -site:techcrunch.com  will match any mention of apple unless it comes from either Techmeme or Techcrunch.
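As an illustration of the rule that a query takes at most one site: but any number of -site: flags, here is a minimal sketch. with_site_filters is a hypothetical helper; accepting a single site argument is simply one way to keep the positive flag unique.

```python
def with_site_filters(query, site=None, excluded_sites=()):
    """Append one optional site: flag and any number of -site: flags."""
    members = [query]
    if site is not None:
        members.append(f"site:{site}")
    members.extend(f"-site:{host}" for host in excluded_sites)
    return " ".join(members)


print(with_site_filters("pubsubhubbub", site="blog.superfeedr.com"))
print(with_site_filters("apple", excluded_sites=["techmeme.com", "techcrunch.com"]))
```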

Link

The link: flag allows you to select only the documents which include a link to a specific page or to a domain.

You can add at most one link: per query.

Query                         Details
link:https://superfeedr.com/  Only entries with a link to our home page.
link:runscope.com             Only entries with a link to any page on the runscope.com domain.

As with site:, you can negate this flag by using -link:. You can have multiple -link: values.

Query                                               Details
pubsubhubbub -link:superfeedr.com -link:google.com  will match any mention of pubsubhubbub unless it points to either superfeedr.com or google.com.

Language

Superfeedr is able to extract the language of every entry individually. This means you can keep only entries in a specific language using language:, or exclude those in specific languages using -language:. A given entry cannot have more than one language, which means you can’t use more than one language: operand. However, you can exclude multiple languages using multiple -language: operands. The value should be the two-letter ISO 639-1 code of the language.

Please note that in some cases we are unable to extract the language (not enough text, contradictory text combining two languages, etc.).

Query         Details
language:en   will match only entries explicitly using the English language. If we are unable to extract the language, this won't match.
-language:it  will exclude entries which use the Italian language. If the language can't be determined, the entries will not match.
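The "at most one" rules for site:, link: and language: can be checked before a query is used. The sketch below is a hypothetical client-side check, not something the service provides. It splits on whitespace only, so it ignores quoting and parentheses, and it assumes language codes are written in lowercase.

```python
import re

# Flags that may appear at most once in their positive form.
SINGLE_USE_FLAGS = ("site", "link", "language")


def check_flags(query):
    """Return a list of problems found in `query` (empty if none)."""
    problems = []
    members = query.split()  # naive split: quoted phrases are not handled
    for flag in SINGLE_USE_FLAGS:
        count = sum(1 for m in members if m.startswith(f"{flag}:"))
        if count > 1:
            problems.append(f"more than one {flag}: flag ({count} found)")
    for member in members:
        name, sep, value = member.lstrip("-").partition(":")
        if sep and name == "language" and not re.fullmatch(r"[a-z]{2}", value):
            problems.append(f"language value {value!r} is not a 2-letter code")
    return problems


print(check_flags("apple -site:techmeme.com -site:techcrunch.com"))
print(check_flags("apple site:a.com site:b.com"))
```

Negated flags are deliberately not counted, since several -site:, -link: or -language: members are allowed.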

Popularity

Each feed going through Superfeedr has a popularity ranking. This popularity is a combination of multiple signals and factors, some internal and others external (social networks, PageRank, etc.). Any feed’s popularity evolves slowly. You can build filters which take the popularity of the source into account and exclude content coming from unpopular sources.

The value should be a range (> or <) to match popularity greater or smaller than a specific value. Check this blog post to learn about the distribution of these ranges.

Query          Details
popularity:>3  will match only entries published in feeds with a popularity greater than 3.

Porn Filtering

By default, Superfeedr tries to identify porn content and will filter it out of matching requests. Please note that this algorithm “learns” from any given feed before it can start to classify an entry as porn, which means that if the exclusion of porn content is an absolute requirement, you should also implement filtering on your side.

We consider any feed with a porn rank higher than 0.2 to be porn.

That said, in some cases (building porn filters, for example!), it makes sense to disable our porn filter. You can achieve this by adding porn:ok to your query.

Query         Details
porn:ok plug  Will send any mention of plug, whether it comes from a porn feed or not.

Bozo Filtering

Similar to porn filtering, Superfeedr is able to filter out matching entries from feeds we consider broken or spammy. For example, some feeds will generate infinite amounts of data using a random id for each new entry.

We consider any feed with a bozo rank higher than 0.3 to be bozo.

You can disable this filtering by using bozo:ok.

Query        Details
bozo:ok ham  Will send any mention of ham, whether it comes from a spam feed or not.
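The popularity, porn and bozo flags can be combined on one query. Here is a minimal sketch with a hypothetical helper, with_feed_filters. It appends the flags after the keywords, assuming the position of a flag within the query does not matter.

```python
def with_feed_filters(query, min_popularity=None, allow_porn=False, allow_bozo=False):
    """Append popularity, porn:ok and bozo:ok flags to `query` as requested."""
    members = [query]
    if min_popularity is not None:
        # Only the "greater than" form of the range is covered here.
        members.append(f"popularity:>{min_popularity}")
    if allow_porn:
        members.append("porn:ok")  # disables the default porn filter
    if allow_bozo:
        members.append("bozo:ok")  # disables the default bozo filter
    return " ".join(members)


print(with_feed_filters("plug", allow_porn=True))
print(with_feed_filters("ham", min_popularity=3, allow_bozo=True))
```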

Testing

Tracking feeds are prospective, which means that you should use them to receive upcoming entries that match them. As a result, refining search queries is not always simple, because you have to wait for matches before you can improve them.

Superfeedr offers a search API which lets you match your tracking feed queries against historical data. You can then quickly identify how to refine your queries for tracking feeds.

POST https://push.superfeedr.com
Parameter Name  Note      Value
hub.mode        required  search
query           required  The query you want to match. Please see previous sections on how to build search queries.
format          optional  json or atom (default).

Example

curl https://push.superfeedr.com/ \
  -X POST \
  -u demo:demo \
  -d 'hub.mode=search' \
  -d 'query=superfeedr'

Response

Superfeedr will return 200 with the corresponding representation of the search results matching your query. Please refer to our schema section for details.

If you receive a 422 HTTP response, please check the body, as it will include the reason for the failure.

Other HTTP response codes are outlined in the HTTP spec.
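For reference, here is a rough Python equivalent of the curl example, using only the standard library. The helper names build_search_request and run_search are hypothetical, and the sketch assumes HTTP Basic authentication, which is what curl's -u option sends.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://push.superfeedr.com/"


def build_search_request(query, username, password, fmt=None):
    """Build (but do not send) a POST request for the search API."""
    fields = {"hub.mode": "search", "query": query}
    if fmt is not None:
        fields["format"] = fmt  # json or atom
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        SEARCH_ENDPOINT,
        data=urlencode(fields).encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def run_search(request):
    """Send the request and return (status, body)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode()
    except urllib.error.HTTPError as error:
        # On a 422, the body carries the reason for the failure.
        return error.code, error.read().decode()
```

A call such as run_search(build_search_request("superfeedr", "demo", "demo")) would perform the same request as the curl example. Keeping the building and sending steps separate makes the request easy to inspect without touching the network.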

Scope

The scope for tracking feeds covers all feeds processed by Superfeedr. This includes feeds subscribed to by subscribers, and feeds published by publishers that have at least one subscriber on their hubs.

We are working on extending this coverage to include any subscribed feed on the web.