Subscribe to Apache Software Foundation infrastructure events
The ASF employs a plain-text publisher/subscriber service called PyPubSub at pubsub.apache.org:2069, which lists most events that happen within the Foundation's infrastructure.
Currently, the service streams the following events:
- Subversion commits
- Git commits and pushes
- Emails to publicly archived lists
- JIRA updates
- Pull Requests and Issues from GitHub
- Staging and publishing notifications sent via our .asf.yaml offering.
Events are delivered as JSON objects in a chunked response stream, with each new chunk being either an event payload or a keep-alive ping.
How to subscribe
Subscribers can pick one or multiple topics to subscribe to, with more specific subscriptions getting fewer, but more specific, event payloads. Construct subscriptions in the form of: http://pubsub.apache.org:2069/topics/go/here, and separate the topics you want to subscribe to with forward slashes.
The service returns events that match all of the topics you are subscribed to. To subscribe to multiple topic batches in an OR'ed way, you may use a comma to separate your batches of topics.
Some examples:
- To subscribe to all svn commits:
http://pubsub.apache.org:2069/svn/commit
- To subscribe to all git commits:
http://pubsub.apache.org:2069/git/commit
- To subscribe to all git events (push+commit) for whimsy.git:
http://pubsub.apache.org:2069/git/whimsy
- To subscribe to all netbeans.apache.org emails:
http://pubsub.apache.org:2069/email/netbeans.apache.org
- To subscribe to PRs opened against beam-foo.git:
http://pubsub.apache.org:2069/github/beam-foo.git/pr
- To subscribe to all commits, both Subversion and git:
http://pubsub.apache.org:2069/commit
- To subscribe to all JIRA events for the HADOOP JIRA instance:
http://pubsub.apache.org:2069/jira/HADOOP
- To subscribe to both JIRA and email streams for tomcat in one go:
http://pubsub.apache.org:2069/jira/TOMCAT,email/tomcat.apache.org
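The slash and comma rules above can be captured in a small helper. The following is a minimal sketch, not part of any ASF library: the name subscription_url and its arguments are hypothetical, and the base URL is simply the plain-text endpoint used in the examples.

```python
BASE_URL = "http://pubsub.apache.org:2069"  # plain-text endpoint used in the examples


def subscription_url(batches, base_url=BASE_URL):
    """Build a subscription URL from batches of topics (hypothetical helper).

    Topics inside one batch are joined with "/" (all of them must match);
    batches are joined with "," (any one batch may match).
    An empty list of batches yields the root URL.
    """
    if any(not batch for batch in batches):
        raise ValueError("each batch needs at least one topic")
    paths = ["/".join(batch) for batch in batches]
    return base_url.rstrip("/") + "/" + ",".join(paths)


if __name__ == "__main__":
    print(subscription_url([["git", "commit"]]))
    print(subscription_url([["jira", "TOMCAT"], ["email", "tomcat.apache.org"]]))
```

Keeping each batch as its own list makes the AND/OR structure of a subscription explicit before it is flattened into a URL.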
Public SVN repo topics consist of 'svn', the first one or two path segments after the /repos/ in the URL, and 'commit'.
For example, changes to the repository https://dist.apache.org/repos/dist/release/ have the topics svn/dist/release/commit.
A commit that involves changes to both dist/release and dist/dev has the topics svn/dist/commit. Note that a subscription to svn/dist/release/commit will not match such a commit, because the topics in the response do not include release.
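To see why a subscription to svn/dist/release/commit misses a commit whose topics are svn/dist/commit, the matching rule can be modelled on the client side. This is a sketch of the behaviour as described on this page, not the server's actual code. The names parse_subscription and matches are hypothetical. Treating topics as an unordered set, and treating an empty subscription as matching everything, are both assumptions.

```python
def parse_subscription(path):
    """Split "a/b,c/d" into a list of topic sets: [{"a", "b"}, {"c", "d"}]."""
    batches = []
    for batch in path.strip("/").split(","):
        topics = {topic for topic in batch.split("/") if topic}
        if topics:
            batches.append(topics)
    return batches


def matches(path, event_topics):
    """Return True if at least one batch is fully contained in event_topics."""
    batches = parse_subscription(path)
    if not batches:
        return True  # assumption: the root URL subscribes to everything
    present = set(event_topics)
    return any(batch <= present for batch in batches)


if __name__ == "__main__":
    mixed_commit = ["svn", "dist", "commit"]  # touched dist/release and dist/dev
    print(matches("svn/dist/commit", mixed_commit))
    print(matches("svn/dist/release/commit", mixed_commit))
```

A filter like this can also be run against the pubsub_topics list of each received payload when you subscribe broadly and narrow the stream locally.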
Private SVN repo topics are constructed in the same way, but have an additional 'private' topic.
For example, https://pubsub.apache.org:2070/private/svn/private/committers/commit returns commits for https://svn.apache.org/repos/private/committers/board/
Event payload examples
Pings are simple objects like this:
{"stillalive": 1583973410.9620552}
An example of a real event payload, in this case a git commit, could be (emails redacted in this example):
{
  "commit": {
    "body": "[maven-release-plugin] prepare for next development iteration\n",
    "committer": "sblackmon <s...@apache.org>",
    "hash": "8e6f956",
    "log": "[maven-release-plugin] prepare for next development iteration",
    "repository": "git",
    "sha": "8e6f956c2eda06ca9debf21634cedcecc96293ff",
    "author": "sblackmon",
    "files": ["pom.xml", "streams-cli/pom.xml", "streams-components/pom.xml"],
    "server": "gitbox",
    "project": "streams",
    "autopublish": false,
    "date": "Wed Mar 11 19:25:06 2020 -0500",
    "committed": "Wed Mar 11 19:25:06 2020 -0500",
    "subject": "[maven-release-plugin] prepare for next development iteration",
    "ref": "refs/heads/master",
    "email": "s...@apache.org",
    "authored": "Wed Mar 11 19:25:06 2020 -0500",
    "ref_names": ""
  },
  "pubsub_topics": ["git", "streams", "commit"],
  "pubsub_path": "/git/streams/commit"
}
Payloads vary depending on what they represent, so check both what sub-objects are present in the payload and the pubsub_path variable, which will show the full payload event path and explain which type is being sent.
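As an illustration of that check, the sketch below labels a decoded payload as either a ping or an event, using only the stillalive, pubsub_path and pubsub_topics keys shown in the examples above. The function name describe is hypothetical, and the assumption that every key not starting with pubsub_ is a sub-object is based solely on the git commit example.

```python
import json


def describe(payload):
    """Return a short label for a decoded PyPubSub payload (hypothetical helper)."""
    if "stillalive" in payload:
        return "ping"
    path = payload.get("pubsub_path", "(no pubsub_path)")
    # Assumption: the remaining keys are the sub-objects, e.g. "commit" for git.
    parts = sorted(key for key in payload if not key.startswith("pubsub_"))
    return f"{path} [{', '.join(parts)}]"


if __name__ == "__main__":
    ping = json.loads('{"stillalive": 1583973410.9620552}')
    event = json.loads(
        '{"commit": {"project": "streams"},'
        ' "pubsub_topics": ["git", "streams", "commit"],'
        ' "pubsub_path": "/git/streams/commit"}'
    )
    print(describe(ping))
    print(describe(event))
```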
Try it yourself
To try it out and take a look at the event stream, use cURL in your terminal:
curl http://pubsub.apache.org:2069/git/commit
A secure version also exists on port 2070, for use with authenticated event streams:
curl https://pubsub.apache.org:2070/git/commit
Please note that due to limitations in our TLS terminator, payloads larger than 64 KB are split into 64 KB chunks on port 2070. If you are using port 2070, you should ensure that the data you receive is terminated with a newline (\n), or continue fetching data until you hit a chunk terminated with a newline.
N.B. the following curl switches may be added:
- -N - non-buffered output; used when piping into another command (e.g. tail)
- -sS - silent mode (-s) but still shows error messages (-S)
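The newline rule for port 2070 can be handled with a small buffer that only decodes JSON once a complete line has arrived. The sketch below is one possible approach rather than the method used by any ASF client. The class name EventBuffer is hypothetical, and it assumes the chunks have already been decoded to text.

```python
import json


class EventBuffer:
    """Collect text chunks and emit one decoded event per complete line."""

    def __init__(self):
        self._pending = ""

    def feed(self, chunk):
        """Add a chunk of text; return the list of events it completed."""
        self._pending += chunk
        # Everything before the last newline is complete; keep the remainder.
        *lines, self._pending = self._pending.split("\n")
        return [json.loads(line) for line in lines if line.strip()]


if __name__ == "__main__":
    buf = EventBuffer()
    print(buf.feed('{"stillal'))          # incomplete, nothing emitted yet
    print(buf.feed('ive": 1583973410}\n'))  # newline seen, event emitted
```

Splitting on newlines, rather than only testing whether a chunk ends with one, also copes with a chunk that happens to carry the end of one event and the start of the next.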
Using PyPubSub in programming
Using PyPubSub with Python
You can listen for and react to payloads in Python using the asfpy pip package:
import asfpy.pubsub

def process_event(payload):
    print("Got an event from PyPubSub!")
    ...

def main():
    pubsub = asfpy.pubsub.Listener('http://pubsub.apache.org:2069/')
    pubsub.attach(process_event, raw=True)

if __name__ == '__main__':
    main()
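A callback such as process_event will usually want to pull specific fields out of the payload. The sketch below summarizes a git commit using the field names from the example payload earlier on this page. It assumes the callback receives the payload as a Python dict, and that other event types (which carry different sub-objects) should simply be skipped. The name format_commit is hypothetical.

```python
def format_commit(payload):
    """Return a one-line summary of a commit payload, or None for anything else."""
    commit = payload.get("commit")
    if not isinstance(commit, dict):
        return None  # a ping, or an event type without a "commit" sub-object
    # Field names are taken from the git commit example; missing ones show as "?".
    return "{} {} {} {}".format(
        commit.get("project", "?"),
        commit.get("ref", "?"),
        commit.get("hash", "?"),
        commit.get("subject", "?"),
    )


def process_event(payload):
    """Callback body: print a summary line for commits, ignore the rest."""
    line = format_commit(payload)
    if line is not None:
        print(line)


if __name__ == "__main__":
    process_event({
        "commit": {
            "project": "streams",
            "ref": "refs/heads/master",
            "hash": "8e6f956",
            "subject": "[maven-release-plugin] prepare for next development iteration",
        },
        "pubsub_path": "/git/streams/commit",
    })
```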
Using PyPubSub with node.js
This sample snippet lets you use node.js to listen for and process pubsub events:
const http = require("http");
const https = require("https");

class PyPubSub {
    constructor(url) {
        this.url = url;
        this.getter = url.match(/^https/i) ? https : http;
    }

    attach(func) {
        this.getter.get(this.url, res => {
            res.setEncoding("utf8");
            let body = '';
            res.on("data", data => {
                // Be mindful of proxies that split pubsub chunks into smaller ones,
                // only load the JSON blob once we hit a newline (\n)
                body += data;
                if (data.endsWith("\n")) {
                    let payload = JSON.parse(body);
                    body = '';
                    func(payload);
                }
            });
        });
    }
}

// Test
function process(payload) {
    if (payload.stillalive) {
        // ping-back?
        console.log("Got a ping-back");
    } else {
        // Actual payload? process it!
        console.log("Got a payload from PyPubSub!");
        console.log(payload);
    }
}

const pps = new PyPubSub('http://pubsub.apache.org:2069/');
pps.attach(process);
Using PyPubSub with Ruby
This sample lets you connect to our pubsub service via Ruby:
require 'net/http'
require 'json'
require 'thread'

pubsub_URL = 'https://pubsub.apache.org:2070/'

def do_stuff_with(event)
  print("Got a pubsub event!:\n")
  print(event)
  print("\n")
end

def listen(url)
  ps_thread = Thread.new do
    begin
      uri = URI.parse(url)
      Net::HTTP.start(uri.host, uri.port, :use_ssl => url.match(/^https:/) ? true : false) do |http|
        request = Net::HTTP::Get.new uri.request_uri
        http.request request do |response|
          body = ''
          response.read_body do |chunk|
            body += chunk
            # All chunks are terminated with \n. Since 2070 can split events into 64kb sub-chunks
            # we wait till we have gotten a newline, before trying to parse the JSON.
            if chunk.end_with? "\n"
              event = JSON.parse(body.chomp)
              body = ''
              if event['stillalive'] # pingback
                print("ping? PONG!\n")
              else
                do_stuff_with(event)
              end
            end
          end
        end
      end
    rescue Errno::ECONNREFUSED => e
      restartable = true
      STDERR.puts e
      sleep 3
    rescue Exception => e
      STDERR.puts e
      STDERR.puts e.backtrace
    end
  end
  return ps_thread
end

begin
  ps_thread = listen(pubsub_URL)
  print("Pubsub thread started, waiting for results...\n")
  while ps_thread.alive?
    sleep 10
  end
end
Want to know more? Have questions?
To learn more, or just get some questions answered, please contact us at users@infra.apache.org, and we'll try our best to help you out.
Acknowledgements
PyPubSub is based on SvnPubSub and gitpubsub. We wish to thank the Subversion project for building the precursor to this service.
Copyright 2024, The Apache Software Foundation, Licensed under the Apache License, Version 2.0.
Apache® and the Apache feather logo are trademarks of The Apache Software Foundation.