Thursday June 23, 2011

[Time] Name Message
[03:01] cryptw is there a way i can configure a publisher socket
[03:01] cryptw to not queue stuff
[03:01] cryptw and just drop the message if there happens to be no subscriber listening?
[03:16] whack cryptw: isn't that the default behavior?
[03:16] cryptw no. my memory usage seems to just grow and grow
[03:17] whack well, I have a pubsub going and when nothing listens, memory usage goes nowhere
[03:24] whack cryptw: check the zmq guide, it says: If a publisher has no connected subscribers, then it will simply drop all messages.
[03:24] cryptw hmm that's odd. my memory usage must be from something else then
[03:24] cryptw thanks
[03:26] whack you can do some heap profiling and such to see where the memory's going
[03:35] michelp cryptw, did you set an identity on the sub socket?
[03:36] cryptw what's an identity? is it that prefix matching thing?
[03:36] cryptw if so, no
[03:37] michelp some explanation is here
[06:23] CIA-32 libzmq: Martin Sustrik master * rec81f8f / (12 files in 3 dirs): New wire format for REQ/REP pattern ...
[06:23] CIA-32 libzmq: Martin Sustrik master * r12532c7 / (src/pipe.cpp src/pipe.hpp src/xrep.cpp src/xrep.hpp): O(1) fair-queueing in XREP implemented ...
[06:23] CIA-32 libzmq: Martin Sustrik master * rd137379 / (9 files in 2 dirs): Outstanding requests dropped when requester dies (issue 190) ...
[06:53] CIA-32 libzmq: Martin Sustrik master * r770d0bc / (14 files in 2 dirs): Fix MSVC build ...
[09:11] JasonCA Everyone sleeping? :)
[09:12] eintr nope
[09:13] JasonCA haha
[09:14] sustrik hu
[09:14] sustrik hi
[11:16] lgbr I have many publishers, and many subscribers. I need a message to go from a publisher to a specific subscriber. What pattern is for me? My problem is that my multiple subscribers can't listen on the same socket
[15:30] ptrb lgbr: PUSH/PULL solves the "only one receiver" problem, but you'll probably need to combine it with something else to solve your full problem
[15:30] ptrb now a question of my own ;)
[15:31] ptrb if I want to use ZMQ to do arbitrary peer discovery, where should I start looking? multicast isn't covered in detail (as far as I can see) in the guide
[15:55] ianbarber ptrb: multicast is pretty straightforward, so there's not so much need. by arbitrary peer discovery do you mean the equivalent of broadcast? if you can have a name service, there are some examples with mmi (i think that's the name?), a name service for the majordomo protocol
[16:18] ptrb ianbarber: my only requirement is to keep all the logic in my app (er, instances of my app)
[16:19] ptrb ianbarber: but yes, broadcast (I think)
[16:20] ptrb I must profess stunning ignorance in the state of the art. The problem is: start N instances of myapp on the same network. Wait T seconds. All N instances should know about all the other ones.
[16:33] ianbarber yeah, that's pretty common. you either need something that can do broadcast/anycast and rely on being on the same net segment, or you need a central authority or similar
[16:33] ianbarber you can delegate the location of that central authority to another system (e.g DNS)
[16:33] ptrb or DNS-equivalents like Bonjour, sure.
[16:34] ianbarber yeah
[16:35] ptrb I like the idea of pushing the complexity of failover/HA/etc. to a layer that's designed to handle it... but at the same time, I like the simplicity of doing a (at least trivial) implementation of that in my app.
[16:35] ptrb I think "being on the same net segment" is an OK restriction.
[16:36] ianbarber you could use a multicast on a known multicast IP/port
[16:36] ianbarber so you would connect a SUB socket on PGM or EPGM on a known address/port to listen for existence messages, and do similar PUBs
[16:37] ianbarber then occasionally pub your existence, and perhaps the existence of other nodes you know about
[16:37] ianbarber could be a bit chatty, you might have to play with it to get the protocol right
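The announce-and-gossip idea ianbarber describes can be sketched without any sockets at all. The following is a minimal in-process simulation, not ZeroMQ code: the `PeerTable` class, the `ttl` parameter, and the `simulate_round` helper are hypothetical names invented for illustration, and the "multicast" is just a loop that hands every announcement to every node. In a real setup the announcements would travel over PUB/SUB sockets on a PGM or EPGM endpoint, as suggested above.

```python
class PeerTable:
    """Tracks which peers were heard from recently (hypothetical helper)."""

    def __init__(self, own_id, ttl=10.0):
        self.own_id = own_id
        self.ttl = ttl          # seconds before a silent peer is considered gone
        self.last_seen = {}     # peer id -> timestamp of the freshest sighting

    def announcement(self, now):
        """Build the message this node would publish: itself plus known peers."""
        peers = dict(self.last_seen)
        peers[self.own_id] = now
        return peers

    def merge(self, announcement):
        """Fold a received announcement in, keeping the newest timestamp per peer."""
        for peer_id, seen in announcement.items():
            if peer_id == self.own_id:
                continue  # a node does not need to discover itself
            if seen > self.last_seen.get(peer_id, float("-inf")):
                self.last_seen[peer_id] = seen

    def alive(self, now):
        """Return the ids of peers heard from within the last ttl seconds."""
        return sorted(p for p, seen in self.last_seen.items()
                      if now - seen <= self.ttl)


def simulate_round(tables, now):
    """Stand-in for multicast: every announcement reaches every node."""
    messages = [table.announcement(now) for table in tables]
    for table in tables:
        for message in messages:
            table.merge(message)
```

Because each announcement also carries the peers the sender already knows, a node can learn about a peer it never heard from directly, which is the "existence of other nodes you know about" part. The timestamps give a simple way to expire nodes that stop announcing; how often to announce and how long to wait before expiry are the parts that would need tuning to keep the protocol from being too chatty.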
[16:38] ptrb ianbarber: great, that's exactly what I wanted to know. thanks.
[18:24] tartine Hello all
[18:45] sustrik hi
[18:50] mikko hi
[18:52] taotetek sustrik: hey thanks for the retweet
[19:08] sustrik that's your article?
[19:10] taotetek sustrik: on batch acking with push/pull? yes
[19:10] sustrik nice
[19:11] taotetek been talking to Ken about getting the ability to batch ack worked into the zeromq rsyslog plugin which would be wonderful
[19:19] al_nunn Is anyone a C# zmq user?
[19:21] al_nunn i need some help starting out using zmq in C#
[19:22] ssi taotetek: got a link to the article/
[19:22] ssi ?
[19:23] taotetek ssi: oh sure
[19:23] taotetek ssi:
[19:25] al_nunn can anyone point me in the direction of how to use the libzmq.dll within my C# code
[19:25] al_nunn i cannot reference it or anything
[19:25] al_nunn and make use of the objects within the library
[19:25] ssi thanks :)
[19:26] taotetek al_nunn: sorry al, not a c# programmer at all
[19:27] al_nunn no worries, i just cant find anything online about getting started with c# and i really want to get experimenting with the zmq library
[19:27] ssi taotetek: I ran into the same issue you're describing... my ventilators were pulling off a jms broker, and they'd suck the entire contents of the queue down before the workers had gotten through one message
[19:27] ssi taotetek: I ended up going to a LRU queue, where the workers REQ "READY" to become eligible for messages, and then get another message every time they ack
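The flow control ssi describes (workers announce READY, and each ack makes them eligible again) can be sketched as a small in-process model. This is an illustration under assumptions, not ssi's code and not ZeroMQ code: `ReadyQueue`, `worker_ready`, and `dispatch` are hypothetical names, and `source` stands in for the upstream broker queue. The point is that nothing is pulled from upstream unless an idle worker is waiting for it.

```python
from collections import deque


class ReadyQueue:
    """Hands out work only to workers that have asked for it (hypothetical)."""

    def __init__(self, source):
        self.source = iter(source)  # upstream messages, e.g. a broker queue
        self.ready = deque()        # idle workers, least recently used first

    def worker_ready(self, worker):
        """Record a READY signal, or an ack that frees the worker again."""
        self.ready.append(worker)

    def dispatch(self):
        """Pull from upstream only while a worker is idle; return assignments."""
        assignments = []
        while self.ready:
            message = next(self.source, None)  # None means upstream is empty
            if message is None:
                break
            assignments.append((self.ready.popleft(), message))
        return assignments
```

With this shape, at most one message per worker is in flight, so a queue of a few very large messages never gets drained into memory all at once. The cost is one round trip per message, which is the overhead discussed in the lines that follow.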
[19:39] taotetek ssi: yup. in my case acking every message produces unacceptable performance, so I worked on a way to choose trade offs between durability and performance
[19:40] taotetek ssi: we have another experiment doing similar but with XREP / XREQ
[19:42] ssi I'm sorta surprised it was that much of a performance hit
[19:43] ssi I haven't done any serious tests with mine yet tho
[19:43] taotetek ssi: in my particular use we are doing statistical analysis and a 1% or so error rate is acceptable and doesn't skew the results. so being able to say it's ok to drop 100 messages or 1000 messages is an acceptable trade off
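The trade-off taotetek mentions can be put into numbers with a tiny model. This is only a sketch under an assumed scheme (the receiver sends one ack per full batch), which may differ from the approach in the article; `run_pipeline` and its parameters are hypothetical names. It counts how many acks cross the wire and the largest number of messages that are unacknowledged at any moment, which bounds what could be lost if a worker dies.

```python
def run_pipeline(total_messages, batch_size):
    """Model sending messages with one ack per full batch.

    Returns (acks_sent, worst_case_loss): the number of acks exchanged and
    the most messages that were unacknowledged at any one time.
    """
    unacked = 0  # messages sent since the last ack
    acks = 0
    worst = 0
    for _ in range(total_messages):
        unacked += 1
        worst = max(worst, unacked)
        if unacked == batch_size:
            acks += 1    # receiver confirms the whole batch at once
            unacked = 0
    return acks, worst
```

For example, `run_pipeline(10_000, 1)` models acking every message, while `run_pipeline(10_000, 1000)` models accepting up to 1000 messages at risk in exchange for far fewer acks.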
[19:43] ssi ahh ok
[19:43] ssi mine is a pipelining workflow engine
[19:44] ssi messages in, messages out
[19:44] ssi can't drop any
[19:44] taotetek ssi: we're looking at hundreds of millions of datapoints so 1000 messages isn't a big deal
[19:44] taotetek ssi: but with a straight push / pull we were noticing ~ 60k to 70k messages in the pipeline even with a HWM of 1 because of the internal queueing
[19:45] ssi yeah, that's pretty much my experience
[19:45] taotetek ssi: ah yes, I'd definitely go REP / REQ if I couldn't lose a message
[19:45] ssi except the tipoff was a bit different...
[19:45] ssi I hooked up to a JMS queue with 50 messages in it, all around 15MB
[19:45] ssi and immediately OOM'd
[19:45] ssi my ventilator would happily pull them off one at a time and throw them downstream, and there was no flow control
[19:46] taotetek ssi: ah yes or messages are only around 200 bytes
[19:46] ssi so I'm using the LRU queue as flow control
[19:46] taotetek s/or/our
[19:46] ssi mine has to handle very many small messages as well as fewer large messages
[19:46] ssi it also will soon need to handle very large content, like raw video
[19:46] ssi but I'm going to be doing that with references
[19:46] taotetek ssi: we're currently handling around 500 million to 1 billion messages a day, and that should go up to ~ 3 billion soon
[19:47] ssi looking at using the new version of the framework that's zmq based for a video syndication pipeline
[19:47] taotetek ssi: so had to find the sweet spot as far as durability / performance
[19:47] ssi yeah
[19:47] taotetek ssi: nice!
[19:47] taotetek ssi: I have a friend I work with who used to be a broadcast engineer
[19:47] ssi I work for TBS
[19:47] taotetek ssi: he'd probably be really interested in what you're doing
[19:47] ssi most of the stuff I work on is sports data, and we use the legacy version of this framework for syndication of stats data
[19:48] sustrik btw, there's a ticket in bug tracker to build acking directly into push/pull pattern, which would solve this kind of issues
[19:48] taotetek sustrik: yup!
[19:48] ssi we'll get xml feeds in from stats providers, and we use the workflows to cut it all up into formats that our CDNs and frontend websites need and syndicate
[19:48] sustrik no idea when it gets done though :|
[19:48] taotetek sustrik: this was a relatively painless work around for the moment
[19:48] sustrik yes, i know
[19:48] taotetek sustrik: I'll pester Ken and see if I can get him to contribute some code
[19:48] taotetek sustrik: as he's been looking into the queuing / buffering already I know
[19:49] sustrik i can help him with advice if he decides to do the work
[19:49] sustrik the related code is a bit complex
[19:50] taotetek sustrik: excellent. it would be a boon to us as well, so if he thought he could take a stab at it I bet we'd foot the bill.
[19:50] sustrik great!
[19:50] taotetek sustrik: he's unfortunately out today but I should be talking to him at the latest by next Thursday; I'll float the idea of him working on it and see if I can get anything to come of it.
[20:10] tartine question: can a slow node slow down a previous node (node1--->node2, node2 has a heavy duty; does node1 adapt its speed to node2)?
[20:12] tartine or will the ømq buffer make node1 store messages in its "zmq stack" while node1 continues working normally?