Schema & Data Formats

This section mostly applies to feed subscriptions. If you’re subscribing to arbitrary content resources, we will send you the exact content of the resource to which you’ve subscribed. Some of the status information is available for non-feed resources as well.

Whatever the original format (RSS, Atom, or any other namespace), the notifications that we send to subscribers use standard Atom, as well as a few other namespaces detailed below. We map as much of the original content as we can onto this format. The goal is to give subscribers a consistent schema that is easy to consume.

Status

In notifications, when subscribing (XMPP only), or when retrieving a resource’s content from Superfeedr, you may see the following information, which describes the current state of a resource. Please note that some items may be missing at times, either because we haven’t processed the feed yet or because they wouldn’t be accurate.

Name Note Value
status[@feed]   contains the URL of the resource
title The feed title
http[@code]   last HTTP status code, please see Status Code Definitions
http   the content of that tag is a more explicit log message for your information
next_fetch   the resource will be fetched no later than this time
period   the polling frequency in seconds for this resource (at least 60 seconds for feeds and at least 300 seconds for arbitrary content)
last_fetch   the last time at which we fetched the resource
last_parse   the last time at which we parsed the resource. We sometimes fetch a resource without parsing it because its content hasn't been modified
last_maintenance_at   Each resource inside Superfeedr has a maintenance cycle that we use to detect stale or related resources. We normally run maintenance at most once every 24 hours for each resource, but this is a low-priority task, so the interval may be longer
entries_count_since_last_maintenance The number of updates in the resource since we last ran the maintenance script. This is a very good indicator of the verboseness of a resource. You may want to remove resources that are too verbose
velocity The number of updates during a maintenance cycle (between 24 and 48 hours). The order of magnitude matters more than the exact number.
popularity Float. Starts at 0 (not popular). The greater the number, the more popular the feed. Popularity is assessed for each feed based on several different signals, including the social web, number of clicks, and number of subscribers. It also depends on the popularity of the web pages that link to the feed.
porn_rank if available Between 0 and 1. The greater the rank, the greater the chance that the feed publishes only porn content.
bozo_rank if available Between 0 and 1. The bozo rank indicates that a feed is probably valid syntactically but likely invalid semantically. For example, feeds with constantly changing unique identifiers for new entries will rank high.
generated_ids true Indicates whether the id for each entry was generated by Superfeedr. If this is missing, you can safely assume that we were able to extract the unique id from the feed itself.

Example

<status feed="http://domain.tld/feed.xml" xmlns="http://superfeedr.com/xmpp-pubsub-ext">
  <http code="200">9718 bytes fetched in 1.462708s : 2 new entries.</http>
  <next_fetch>2013-05-10T11:19:38-07:00</next_fetch>
  <period>900</period>
  <last_fetch>2013-05-10T11:19:38-07:00</last_fetch>
  <last_parse>2013-05-10T11:17:19-07:00</last_parse>
  <last_maintenance_at>2013-05-10T09:45:08-07:00</last_maintenance_at>
  <entries_count_since_last_maintenance>5</entries_count_since_last_maintenance>
  <velocity>12</velocity>
  <bozo_rank>0.1</bozo_rank>
</status>
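
As a rough illustration, the status block could be read on the subscriber side as sketched below. This is only a sketch using Python's standard library: the helper name `parse_status` and the output key names are hypothetical, the sample document is a shortened copy of the example above, and every field is treated as optional because any of them may be missing.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example above.
SF_NS = "http://superfeedr.com/xmpp-pubsub-ext"

# Shortened copy of the example status block.
STATUS_XML = """<status feed="http://domain.tld/feed.xml" xmlns="http://superfeedr.com/xmpp-pubsub-ext">
  <http code="200">9718 bytes fetched in 1.462708s : 2 new entries.</http>
  <next_fetch>2013-05-10T11:19:38-07:00</next_fetch>
  <period>900</period>
  <velocity>12</velocity>
  <bozo_rank>0.1</bozo_rank>
</status>"""


def qname(name):
    """Build the namespace-qualified tag name that ElementTree expects."""
    return "{%s}%s" % (SF_NS, name)


def parse_status(xml_text):
    """Return a dict holding only the status fields that are present (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    status = {"feed": root.get("feed")}

    http = root.find(qname("http"))
    if http is not None:
        code = http.get("code")
        if code is not None:
            status["http_code"] = int(code)
        status["http_message"] = http.text

    # Numeric fields: copy only the ones that exist in this document.
    numeric = (
        ("period", int),
        ("entries_count_since_last_maintenance", int),
        ("velocity", float),
        ("popularity", float),
        ("porn_rank", float),
        ("bozo_rank", float),
    )
    for name, cast in numeric:
        node = root.find(qname(name))
        if node is not None and node.text:
            status[name] = cast(node.text)

    # Date fields are kept as ISO 8601 strings in this sketch.
    for name in ("next_fetch", "last_fetch", "last_parse", "last_maintenance_at"):
        node = root.find(qname(name))
        if node is not None and node.text:
            status[name] = node.text
    return status


status = parse_status(STATUS_XML)
```

Fields that are absent from the document (here `last_fetch`, for instance) are simply left out of the resulting dict, so callers can test for their presence.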

Entry Schema (feeds only)

Notification entries will have the following form, which is standard Atom. Please note that an entry might not contain all of these elements.

Here are the components used to build the entries. Please note that some of them use specific namespaces.

Link

Name Note Value
link[@href] the URL related to the parent node
link[@rel] optional the type of relation to that parent node (alternate, reply, etc.)
link[@type] optional MIME type of the link destination (text/html by default)
link[@title] optional the link title

Example

<link href="http://domain.tld/entries/12345" rel="alternate" type="text/html" title="The sky is Blue" />
<link href="http://domain.tld/entries/12345/comments.xml" rel="replies" type="application/atom+xml" title="Comments on The sky is Blue" />

Category

Name Note Value
category[@term] optional, multiple a keyword related to the entry (tag, category, or topic)

Example

<category term="tag" />
<category term="category" />

Point

Name Note Value
entry[@point] optional, multiple geolocation data. Contains a [GeoRSS](http://georss.org/) latitude and longitude, separated by a space. It's either extracted from the story or extrapolated from the content.

Example

<point xmlns="http://www.georss.org/georss">47.597553 -122.15925</point>
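
A point value like the one above can be split into numeric coordinates with a few lines of code. The sketch below assumes the text always has the form "latitude longitude" separated by whitespace, as in the example; `parse_georss_point` is a hypothetical helper name.

```python
def parse_georss_point(text):
    """Split a GeoRSS point ("lat lon") into a (latitude, longitude) tuple of floats."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError("expected 'lat lon', got %r" % text)
    return float(parts[0]), float(parts[1])


lat, lon = parse_georss_point("47.597553 -122.15925")
```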

Author

Name Note Value
author optional, multiple Author information
name optional the author's name (or nickname)
email optional the author's email address
uri optional the author's URI
object-type optional, multiple the object type, defined in the ActivityStreams spec
link optional, multiple links (see above). They can include links to the author's profile, to the user's avatar...

Example

<author>
 <name>John Doe</name>
 <email>john@superfeedr.com</email>
 <uri>http://twitter.com/superfeedr</uri>
 <as:object-type>http://activitystrea.ms/schema/1.0/person</as:object-type>
 <link type="image/png" title="John Doe" href="http://domain.tld/john.png" rel="image"/>
</author>

Object

Name Note Value
object optional, multiple ActivityStreams
object-type optional, multiple the object type, defined in the ActivityStreams spec
id optional the unique identifier of the object
title optional the title of the object
published optional the publication date (ISO 8601) of the object
updated optional the updated date (ISO 8601) of the object
content optional the content of the object
author optional, multiple author information (see above)
category optional, multiple categories (see above)
link optional, multiple links (see above)

Example

<as:object-type>http://gowalla.com/schema/1.0/spot</as:object-type>
<as:object-type>http://activitystrea.ms/schema/1.0/place</as:object-type>
<id>object-id</id>
<title>Title of the Object</title>
<published>2013-04-20T15:00:40+02:00</published>
<updated>2013-04-21T14:00:40+02:00</updated>
<content>hello world</content>
<author>
  <name>Second</name>
  <uri>http://domain.tld/second</uri>
  <email>second@domain.tld</email>
</author>
<link type="text/html" title="" href="http://domain.tld/object/2" rel="alternate"/>
<link type="text/html" title="" href="http://domain.tld/object/2" rel="alternate"/>

Verb

Name Note Value
verb optional, multiple defined in the ActivityStreams spec

Example

<as:verb>http://activitystrea.ms/schema/1.0/post</as:verb>

Entries

Entries may include all the above elements. They also contain specific nodes, listed below.

Name Note Value
entry[@xml:lang] optional The language of the entry. It's either extracted or computed from the content (the longer the content, the more relevant).
entry[@title] The title of the entry.
entry[@published] optional The publication date (ISO 8601) of the entry.
entry[@updated] optional The last updated date (ISO 8601) of the entry.
entry[@content] optional The content of the entry. Check the type attribute to determine the MIME type.
entry[@summary] optional The summary of the entry. Check the type attribute to determine the MIME type.
entry[@source] optional The source of the entry. It includes all the available feed-level elements, such as the feed title, the feed links, and the feed's author(s). It's especially useful for track feeds.

Example

<entry xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.georss.org/georss" xmlns:as="http://activitystrea.ms/spec/1.0/" xml:lang="en">
   <id>domain.tld:09/05/03-1</id>
   <published>2013-04-21T14:00:40+02:00</published>
   <updated>2013-04-21T14:00:40+02:00</updated>
   <title>Entry published on hour ago</title>
   <content type="text">Entry published on hour ago when it was shinny outside, but now it's raining</content>
   <summary type="text">Entry published on hour ago...</summary>
   <geo:point>47.597553 -122.15925</geo:point>
   <link type="text/html" title="" href="http://domain.tld/entry/1" rel="alternate"/>
   <link type="application/atom+xml" title="" href="http://domain.tld/entry/1.xml" rel="replies"/>
   <link type="image/png" title="A beautiful picture that illustrates the entry" href="http://domain.tld/entry/image_.png" rel="picture"/>
   <category term="Things"/>
   <category term="Picture"/>
   <author>
      <name>First</name>
      <uri>http://domain.tld/first</uri>
      <email>first@domain.tld</email>
      <id>id-first</id>
      <title>First</title>
      <as:object-type>http://activitystrea.ms/schema/1.0/person</as:object-type>
      <as:object-type>http://activitystrea.ms/schema/1.0/dude</as:object-type>
      <link type="image/png" title="A beautiful picture that illustrates the author" href="http://domain.tld/first.png" rel="picture"/>
      <link type="text/html" title="The profile page of the author" href="http://domain.tld/first/profile" rel="alternate"/>
   </author>
   <author>
      <name>Second</name>
      <uri>http://domain.tld/second</uri>
      <email>second@domain.tld</email>
   </author>
   <as:verb>http://activitystrea.ms/schema/1.0/post</as:verb>
   <as:verb>http://activitystrea.ms/schema/1.0/publish</as:verb>
   <as:object>
      <as:object-type>http://gowalla.com/schema/1.0/spot</as:object-type>
      <as:object-type>http://activitystrea.ms/schema/1.0/place</as:object-type>
      <id>object-id</id>
      <title>Title of the Object</title>
      <published>2013-04-20T15:00:40+02:00</published>
      <updated>2013-04-21T14:00:40+02:00</updated>
      <content>hello world</content>
      <author>
         <name>Second</name>
         <uri>http://domain.tld/second</uri>
         <email>second@domain.tld</email>
      </author>
      <link type="text/html" title="" href="http://domain.tld/object/2" rel="alternate"/>
      <link type="text/html" title="" href="http://domain.tld/object/2" rel="alternate"/>
   </as:object>
   <source>
      <id>feed-id</id>
      <title>feed-title</title>
      <updated>2013-04-21T14:00:40+02:00</updated>
      <link href="http://domain.tld/entries/12345" rel="alternate" type="text/html" title="The sky is Blue" />
      <link href="http://domain.tld/entries/12345/comments.xml" rel="replies" type="application/atom+xml" title="Comments on The sky is Blue" />
   </source>
</entry>
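
To show how the pieces fit together, here is a minimal sketch of reading such an entry with Python's standard library. It is not an official client: `summarize_entry` and the keys of the returned dict are hypothetical, the sample is a trimmed-down entry in the same shape as the example above, and only direct children of the entry are read (links and authors nested in the object or the source are ignored).

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the example entry above.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "as": "http://activitystrea.ms/spec/1.0/",
}
# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Trimmed-down entry, for illustration only.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom" xmlns:as="http://activitystrea.ms/spec/1.0/" xml:lang="en">
  <id>domain.tld:09/05/03-1</id>
  <title>Sample entry</title>
  <link type="text/html" href="http://domain.tld/entry/1" rel="alternate"/>
  <link type="application/atom+xml" href="http://domain.tld/entry/1.xml" rel="replies"/>
  <category term="Things"/>
  <category term="Picture"/>
  <author><name>First</name></author>
  <author><name>Second</name></author>
  <as:verb>http://activitystrea.ms/schema/1.0/post</as:verb>
</entry>"""


def summarize_entry(xml_text):
    """Collect the most common entry fields; anything missing comes back as None or empty."""
    entry = ET.fromstring(xml_text)

    # Group link hrefs by their rel attribute; Atom treats a missing rel as "alternate".
    links = {}
    for link in entry.findall("atom:link", NS):
        links.setdefault(link.get("rel", "alternate"), []).append(link.get("href"))

    return {
        "id": entry.findtext("atom:id", default=None, namespaces=NS),
        "title": entry.findtext("atom:title", default=None, namespaces=NS),
        "language": entry.get(XML_LANG),
        "links": links,
        "categories": [c.get("term") for c in entry.findall("atom:category", NS)],
        "authors": [
            a.findtext("atom:name", default=None, namespaces=NS)
            for a in entry.findall("atom:author", NS)
        ],
        "verbs": [v.text for v in entry.findall("as:verb", NS)],
    }


summary = summarize_entry(ENTRY_XML)
```

Because every element except the title is optional, the sketch never indexes into a result directly and falls back to `None` or an empty list instead.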

JSON

Superfeedr lets you subscribe to Atom and RSS feeds but receive notifications (via both XMPP and PubSubHubbub) in JSON. The JSON format is a mapping of our Atom schema, created with the goal of being compatible with the OSync and ActivityStreams JSON schemas.

  • Dates: the dates shown are Unix timestamps (seconds since the epoch), expressed in UTC.
  • Keys: expressed as camel case.
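
The two conventions above can be illustrated with a short sketch. Both helper names are hypothetical, and the key conversion only reflects the naming pattern of the status fields (for example `last_maintenance_at` becoming `lastMaintenanceAt`); other parts of the JSON mapping are not a mechanical renaming of the Atom elements.

```python
from datetime import datetime, timezone


def to_utc_datetime(timestamp):
    """Convert a Unix timestamp (seconds since the epoch) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def to_camel_case(name):
    """Turn a snake_case status field name into its camel case JSON key."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


published = to_utc_datetime(1271851240)
key = to_camel_case("entries_count_since_last_maintenance")
```

Here `published.isoformat()` gives `2010-04-21T12:00:40+00:00`, and `key` is `entriesCountSinceLastMaintenance`.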

Example

{
 "status": {
   "entriesCountSinceLastMaintenance": 24,
   "velocity": 50,
   "lastParse": 1290793065,
   "period": "600",
   "lastMaintenanceAt": 1290778665,
   "feed": "http://domain.tld/feed.xml",
   "lastFetch": 1290796665,
   "code": 200,
   "title": "A wonderful feed",
   "nextFetch": 1290803865,
   "http": "Awesome we got the feed right"
  },
  "items": [
   {
    "geo": {
     "type": "point",
     "coordinates": [
      47.597553,
      -122.15925
     ]
    },
    "standardLinks": {
     "picture": [
      {
       "type": "image/png",
       "href": "http://domain.tld/entry/image_.png",
       "title": "A beautiful picture that illustrates the entry"
      }
     ],
     "replies": [
      {
       "type": "text/html",
       "href": "http://domain.tld/entry/1.xml",
       "title": ""
      }
     ]
    },
    "permalinkUrl": "http://domain.tld/entry/1",
    "verb": "publish",
    "content": "Entry published on hour ago when it was shinny outside, but now it's raining",
    "published": 1271851240,
    "actor": {
     "displayName": "First",
     "image": "http://domain.tld/first.png",
     "permalinkUrl": "http://domain.tld/first/profile",
     "id": "id-first",
     "objectType": "dude",
     "title": "First"
    },
    "categories": [
     "Things",
     "Picture"
    ],
    "id": "domain.tld:09/05/03-1",
    "object": {
     "permalinkUrl": "http://domain.tld/object/2",
     "content": "hello world",
     "published": 1271768440,
     "actor": {
      "displayName": "Second",
      "permalinkUrl": "http://domain.tld/second"
     },
     "id": "object-id",
     "updated": 1271851240,
     "title": "Title of the Object",
     "objectType": "place"
    },
    "title": "Entry published on hour ago",
    "updated": 1271851240,
    "source": {
      "id": "http://blog.superfeedr.com/",
      "title": "Superfeedr Blog",
      "updated": 1245776753,
      "permalinkUrl": "http://blog.superfeedr.com/"
    },
    "language": "en"
   },
   {
    "permalinkUrl": "http://www.macrumors.com/2009/05/06/adwhirl-free-ad-supported-iphone-apps-can-very-lucrative/",
    "published": 1241616887,
    "content": "Mobile advertising company AdWhirl issued a report (PDF) that details the success of some of the top ad-supported iPhone apps. AdWhirl serves 250 million ad impressions monthly to over 10% of the top 50 Apps in the App Store.br /\n   br /\n   br /\n   ...",
    "id": "http://www.macrumors.com/2009/05/06/adwhirl-free-ad-supported-iphone-apps-can-very-lucrative/",
    "title": "AdWhirl: Free Ad-Supported iPhone Apps Can Be Very Lucrative"
   }
  ],
  "subtitle": "This is certainly a wonderful feed that you need to read!",
  "standardLinks": {
   "canonical": [
    {
     "type": "application/atom+xml",
     "href": "http://feed.domain.tld/main.xml",
     "title": null
    }
   ],
   "self": [
    {
     "type": "application/atom+xml",
     "href": "http://domain.tld/feed.xml",
     "title": null
    }
   ]
  },
  "id": "http://domain.tld/feed.xml",
  "title": "A wonderful feed",
  "updated": 1290800265
}

We recommend that you check the schema for some of the feeds to which you subscribe to make sure that all the fields required by your application are included. Feel free to get in touch if you find that content is missing from the feeds.
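
One way to run such a check is sketched below, under the assumption that notifications arrive as a JSON document shaped like the example above. The `REQUIRED_KEYS` tuple and the `missing_fields` helper are hypothetical and stand in for whatever your application actually needs; the sample payload is made up for illustration.

```python
import json

# Hypothetical list of keys that an application might require on every item.
REQUIRED_KEYS = ("id", "title", "permalinkUrl", "published")

# Made-up notification in the same shape as the JSON example above.
NOTIFICATION = """{
  "status": {"code": 200, "period": "600", "feed": "http://domain.tld/feed.xml"},
  "items": [
    {"id": "domain.tld:09/05/03-1", "title": "First entry", "published": 1271851240,
     "permalinkUrl": "http://domain.tld/entry/1"},
    {"id": "http://domain.tld/entry/2", "title": "Second entry"}
  ],
  "title": "A wonderful feed"
}"""


def missing_fields(body, required=REQUIRED_KEYS):
    """Map each item id to the list of required keys that the item lacks."""
    payload = json.loads(body)
    report = {}
    for item in payload.get("items", []):
        missing = [key for key in required if key not in item]
        if missing:
            report[item.get("id", "<no id>")] = missing
    return report


report = missing_fields(NOTIFICATION)
```

With the sample payload, only the second item is reported, because it lacks `permalinkUrl` and `published`.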