Monday February 21, 2011

[Time] Name Message
[02:54] JStoker Hi. Are there any debian packages for zeromq? :)
[02:56] JStoker Aah. Thanks, I'll try to work out a way of coaxing that into my squeeze install. :)
[02:57] JStoker Ah. Hm.
[02:58] JStoker The debian machine works fine with compiling, I guess. The ubuntu one on the other hand seems to have a bit of a dependency failure, so isn't able to actually compile it itself.
[02:59] JStoker Requires uuid-dev, but "libuuid1 (= 2.17.2-0ubuntu1) but 2.17.2-0ubuntu1.10.04.2 is to be installed"
[03:00] JStoker I just tried `apt-get install uuid-dev`, didn't specify a version :/
[03:02] JStoker I've just forced it to a different version, which has seemed to help... (it's installing, at least)
[03:17] JStoker That's the ub... oh. I just went and built my own package out of git:// Whoops.
[03:17] JStoker seb`, Question. Are 2.0.10 and 2.1.x compatible?
[03:18] JStoker That does report itself as 2.1.1 though, so no issues. :)
[03:20] JStoker oh well. Got the difficult system running with 2.1 anyway, so might as well continue with it on the debian one.
[10:23] CIA-21 zeromq2: Martin Sustrik master * r5c09311 / (src/pgm_socket.cpp src/pgm_socket.hpp):
[10:23] CIA-21 zeromq2: Computation of buffer size for PGM fixed.
[10:23] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[10:23] pieter_hintjens sustrik, I'm seeing zmq_term just exit the process
[10:24] pieter_hintjens does that sound normal?
[10:24] sustrik no
[10:24] sustrik do you have a test case
[10:24] sustrik ?
[10:24] pieter_hintjens trying to make some of the Guide examples work with 2.1...
[10:24] pieter_hintjens yes, I have a test case... I'll email it to you
[10:24] sustrik thanks
[10:25] pieter_hintjens
[10:26] pieter_hintjens also some problems transferring messages but I'll break that down
[10:27] sustrik ok
[10:28] mikko good morning
[10:28] sustrik morning
[10:29] pieter_hintjens hi mikko :-)
[10:34] pieter_hintjens sustrik: you didn't add my log on unroutable messages...
[10:35] sustrik nope
[10:35] pieter_hintjens any particular reason?
[10:35] sustrik it's a standard behaviour
[10:35] sustrik nothing special to log
[10:35] sustrik it's like dropping an IP packet
[10:35] pieter_hintjens i violently disagree
[10:36] pieter_hintjens debugging dropped messages is one of the major pains
[10:36] pieter_hintjens xrep dropping a malformed envelope is not like dropping an IP packet
[10:37] pieter_hintjens it's a sign of application error, and that information is particularly valuable to the developer
[10:37] sustrik yes, we should log malformed envelope
[10:37] pieter_hintjens that is what I had done...
[10:37] sustrik not dropping a well-formed envelope though
[10:37] pieter_hintjens malformed = no such address, sorry
[10:37] sustrik well-formed message i mean
[10:37] sustrik it's on critical path
[10:38] pieter_hintjens errors are exceptional;
[10:38] sustrik if you need it for debugging
[10:38] sustrik we can add some special debug functionality
[10:38] pieter_hintjens when you start to use xrep this is a major pain
[10:38] pieter_hintjens errors -> silence
[10:38] sustrik yes
[10:38] pieter_hintjens the only way I know to debug is to hack 0MQ to print something
[10:38] pieter_hintjens it's a *major* annoyance
[10:38] sustrik it means that the original requester is no longer available
[10:38] pieter_hintjens no
[10:38] sustrik so the reply is dropped
[10:38] pieter_hintjens it means there's a bug in my app code
[10:39] sustrik ?
[10:39] pieter_hintjens i assume you've never written code that uses xrep then
[10:39] sustrik i did
[10:39] sustrik queue device :)
[10:39] pieter_hintjens well, I've done a lot of work using xrep
[10:39] pieter_hintjens and it's tricky
[10:40] pieter_hintjens and the classic 80% error is a bad envelope (valid for 0MQ but where the routing information is wrongly placed)
[10:40] pieter_hintjens 0MQ absolutely must report this
[10:40] pieter_hintjens ask on the list if you're not sure
[10:40] pieter_hintjens it's not a critical path
[10:40] pieter_hintjens there are not millions of malformed addresses per second
[10:40] pieter_hintjens no way
[10:40] sustrik how would you distinguish it from simply dropping the packet because the destination is simply dead atm
[10:40] mikko pieter_hintjens: this is pretty classic as well
[10:40] pieter_hintjens same thing, sustrik
[10:40] mikko might want to feature in the guide
[10:40] pieter_hintjens not critical path, that's the first thing
[10:41] pieter_hintjens this is an exception and can be logged
[10:41] pieter_hintjens mikko: yes, ipc has some gotchas
[10:41] pieter_hintjens i'm trying to debug a case now and the 'silence when dropping messages' just cost me an hour
[10:41] pieter_hintjens bleh
[10:42] sustrik let's add some debugging code then
[10:42] pieter_hintjens sustrik: why? you have a mechanism that's ideal
[10:42] pieter_hintjens as a user I'm asking formally that syslog be used to report bad addresses
[10:42] sustrik the problem is that a node being offline is a pretty common occurrence
[10:42] pieter_hintjens it will also help debug topology errors
[10:42] sustrik at least on wan
[10:42] pieter_hintjens 'pretty common' is not a critical path
[10:42] pieter_hintjens wan is not our environment today
[10:43] pieter_hintjens these are bogus arguments afaics, sorry
[10:43] sustrik i'm planning for the future
[10:43] pieter_hintjens you're making it unnecessarily difficult for the present
[10:43] pieter_hintjens at which point the future becomes less certain at all
[10:43] pieter_hintjens if people cannot easily debug routing issues and topology issues they will lose confidence in 0MQ
[10:44] pieter_hintjens if you actually hit this *theoretical* issue of millions of WAN disconnections
[10:44] pieter_hintjens *then* you can solve that problem in the correct fashion
[10:44] pieter_hintjens refusing to log this information today is not a solution
[10:44] pieter_hintjens you would probably want to reduce the logging level on specific sockets
[10:44] pieter_hintjens in 3 years' time
[10:44] sustrik ok, the real problem is that once i add that kind of thing, people start using it to drive their business logic
[10:45] pieter_hintjens it's syslog
[10:45] pieter_hintjens syslog cannot be used for presence
[10:45] pieter_hintjens we know this, it's not a risk
[10:45] sustrik which means the applications would become not scalable outside of a lan
[10:45] pieter_hintjens ?
[10:45] pieter_hintjens sorry, let's kill inproc then
[10:45] pieter_hintjens and multicast
[10:45] sustrik ?
[10:45] pieter_hintjens cause people might depend on that
[10:45] pieter_hintjens and not be scalable outside the lan
[10:46] pieter_hintjens seriously?
[10:46] sustrik nope, you can scale inproc and multicast by replacing them with tcp
[10:46] pieter_hintjens sigh
[10:46] sustrik when you start using logs in your business logic
[10:46] pieter_hintjens how?
[10:46] sustrik you are screwed
[10:46] pieter_hintjens how do you use logs in your business logic?
[10:46] sustrik as the logs are local
[10:46] pieter_hintjens syslog is async
[10:46] sustrik 1. send x to y
[10:47] pieter_hintjens you cannot rely on it for business logic
[10:47] pieter_hintjens we already discussed this on the list
[10:47] sustrik 2. if i get "non routable" error
[10:47] sustrik 3. send it somewhere else
[10:47] sustrik yes, i've discussed it many times
[10:47] pieter_hintjens so people can abuse a tool
[10:47] pieter_hintjens that does not mean you throw away the tool
[10:47] sustrik and each time people asked for this kind of thing
[10:47] pieter_hintjens you educate them
[10:47] sustrik i discussed it with them
[10:48] pieter_hintjens education, not censorship martin
[10:48] sustrik and after some conversation it became obvious they want to drive their business logic
[10:48] pieter_hintjens you do not actually know what people need
[10:48] sustrik not a single person wanting just logging
[10:48] pieter_hintjens i want just logging
[10:48] sustrik it's 100% misuse rate
[10:48] sustrik 99% then
[10:49] sustrik ok, an idea
[10:49] sustrik what about having a "debug" build which would log this kind of errors
[10:49] pieter_hintjens and if people want to use routing failures in their app, why not?
[10:49] sustrik and release build that won't
[10:49] sustrik ?
[10:49] pieter_hintjens hang on
[10:50] sustrik it doesn't scale
[10:50] pieter_hintjens I'm challenging your assertion that using this information won't scale
[10:50] pieter_hintjens if I make a broker that uses this
[10:50] pieter_hintjens the broker will scale fine
[10:50] pieter_hintjens it's like getting disconnection alerts
[10:51] sustrik wait a sec
[10:51] sustrik let me explain
[10:51] sustrik say you have a server (XREP)
[10:51] sustrik and clients (XREQ)
[10:52] sustrik the server uses sys://log to find out whether clients are disconnected
[10:52] pieter_hintjens yes?
[10:52] sustrik when it finds out, it will do something relevant, like alert the operator or somesuch
[10:52] pieter_hintjens you mean ... log a message somewhere else?
[10:53] pieter_hintjens on the operator console, for example...
[10:53] sustrik i mean, initiate some other business process
[10:53] sustrik like sending the message by DHL
[10:53] pieter_hintjens ? this is your example?
[10:53] pieter_hintjens business logic would be
[10:53] sustrik or placing a message on dlq
[10:53] pieter_hintjens - discover principal connection is dead
[10:53] pieter_hintjens - switch over to backup connection
[10:53] sustrik or running a special repair script
[10:54] sustrik any business logic will do
[10:54] sustrik now
[10:54] sustrik the company grows
[10:54] sustrik and have several offices all over the world
[10:54] sustrik it wants to federate
[10:54] sustrik so they place a queue device at each location
[10:55] pieter_hintjens sustrik, you have device obsession
[10:55] sustrik now imagine a client at point A invoking a service at point B
[10:55] pieter_hintjens people do not, ime, use the standard devices except as a stepping stone while they're learning 0MQ
[10:55] sustrik the connection to A may be perfectly ok, but the client may be dead
[10:55] pieter_hintjens they build their own devices, = brokers
[10:55] sustrik the same thing
[10:56] sustrik the problem is that the "cannot route" error works in hop-by-hop fashion
[10:56] pieter_hintjens i really can't swallow your chain of logic
[10:56] pieter_hintjens it is based on the assumption that all 0MQ applications today will one day scale to the WAN
[10:56] sustrik so the error you get is 'cannot route to next hop'
[10:56] sustrik instead of 'message is unroutable'
[10:56] pieter_hintjens which is true for about 0.001% of cases, maybe
[10:57] pieter_hintjens 99.999% of today's apps (and those written for the next two years) will do what they do, and basta
[10:57] pieter_hintjens that is our constituency
[10:57] sustrik if IP folks would think that way Internet would interconnect the original 3 machines still :)
[10:57] pieter_hintjens trying to stop people writing those apps any way they can is counter-productive
[10:57] pieter_hintjens i'm not a luddite, please
[10:58] pieter_hintjens and you are NOT reinventing the Internet, don't imagine that
[10:58] pieter_hintjens I'm asking for basic tools to make today's problem slightly easier to solve
[10:58] sustrik nope, i'm building a new layer for it
[10:58] sustrik ack
[10:58] pieter_hintjens you ignore the requests of your users at your direct peril
[10:58] sustrik i know
[10:58] pieter_hintjens people will not give much patience to a nerd who tells them "I know your problems better than you do"
[10:59] sustrik there are several options how to solve the problem
[10:59] pieter_hintjens you need people to bring their problems and expertise
[10:59] pieter_hintjens to experiment, even if it means blowing things up
[10:59] pieter_hintjens without that, what you make will be dead and unused
[10:59] pieter_hintjens just a collection of weird names and bizarre theories
[10:59] pieter_hintjens that's 90% of the software universe
[10:59] sustrik i would suggest you fork the stable
[11:00] pieter_hintjens not yet
[11:00] sustrik and apply the patch there
[11:00] pieter_hintjens i cannot run my basic examples on 2.1
[11:00] pieter_hintjens and I'm not going to fork 0MQ... nope
[11:00] sustrik a regression?
[11:00] sustrik what happened?
[11:00] pieter_hintjens stabilization = current version + patches from you
[11:00] pieter_hintjens no more or less
[11:01] pieter_hintjens I'd like to request some debugging framework (syslog is frankly quite heavy) that tells me when I make the stupid and repeated error of misconstructing a routing envelope
[11:01] sustrik well, the problem is we have different goals
[11:01] pieter_hintjens we have different timeframes, is all
[11:01] pieter_hintjens i'm actually using 0MQ and finding difficulties with it
[11:01] sustrik how can we possibly synchronise?
[11:02] pieter_hintjens accept the problems your users present you as real
[11:02] sustrik sure they are
[11:02] pieter_hintjens if they take the effort to express them, that means there is real pain involved
[11:02] sustrik i proposed a solution: let's make a debug version of 0mq
[11:02] pieter_hintjens yes, that might work but
[11:02] pieter_hintjens we know why it won't, in practice
[11:02] sustrik #ifdef ZMQ_DEBUG
[11:02] sustrik log (...);
[11:02] sustrik #endif
[11:02] sustrik easy
[11:03] pieter_hintjens Except that many people provide 0MQ to their own customers
[11:03] pieter_hintjens Now they face the lovely choice of (a) debug version or (b) silent on failure version
[11:05] pieter_hintjens ok, I have a proposal
[11:05] pieter_hintjens i don't actually like the syslog solution at all, for developers
[11:06] pieter_hintjens it's a pain to use and all it really does is collect output you can print to a log file
[11:06] pieter_hintjens if you try to *use* that information in any way you hit Sustrik's Barrier of "it will not scale to the WAN"
[11:06] pieter_hintjens ack?
[11:07] pieter_hintjens it can work internally as a log collector
[11:07] pieter_hintjens so leave it undocumented, and add a method zmq_verbose() that enables printing of syslog messages
[11:07] pieter_hintjens or zmq_debug() or whatever
[11:08] sustrik not bad
[11:08] sustrik same as ZMQ_DEBUG but configurable at runtime
[11:08] pieter_hintjens then when things go weird we can tell users, "run again with the --verbose switch and send me the screen dump"
[11:08] pieter_hintjens yes, has to be runtime configurable
[11:08] sustrik ack
[11:08] sustrik that's a good solution
[11:08] pieter_hintjens IMO there is no performance hit in doing this for all exceptional conditions
[11:08] pieter_hintjens if we do see such a performance issue, we can solve it
[11:09] pieter_hintjens if the syslog collector is embedded in zmq, it can't be abused and it requires no extra work except that one call
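
A minimal sketch of the idea agreed above: dropped messages stay silent by default, and a switch that can be flipped at runtime turns on reporting. This is a toy model in Python, not 0MQ code. `zmq_verbose()` is only a name proposed in this conversation, and `Router`, `verbose` and `log` are hypothetical names chosen for illustration.

```python
import sys


class Router:
    """Toy model of an XREP-style socket. Hypothetical, not the 0MQ API."""

    def __init__(self, verbose=False, log=sys.stderr):
        self.peers = {}         # identity -> list of delivered payloads
        self.verbose = verbose  # runtime switch, in the spirit of the proposed zmq_verbose()
        self.log = log          # where drop reports are printed

    def connect(self, identity):
        """Register a peer so that messages addressed to it can be routed."""
        self.peers[identity] = []

    def send(self, envelope):
        """Route on the first frame; return False if the message was dropped."""
        identity, payload = envelope[0], envelope[1:]
        if identity not in self.peers:
            # Unroutable: always dropped, reported only when verbose is on.
            if self.verbose:
                print(f"dropped: no route to {identity!r}", file=self.log)
            return False
        self.peers[identity].append(payload)
        return True
```

With `verbose=False` the drop path costs one extra boolean check, which is the "no performance hit for exceptional conditions" argument made above.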
[11:14] Guthur is there an optimum zmq_size size?
[11:14] pieter_hintjens Guthur: to achieve what?
[11:14] pieter_hintjens high throughput or low latency?
[11:15] Guthur I was thinking of adding a stream interface to clrzmq2
[11:15] pieter_hintjens what is a stream interface?
[11:15] Guthur and so i was thinking it would send that as a series of msgs
[11:16] Guthur just passing in a stream to send
[11:16] pieter_hintjens no frames?
[11:16] Guthur and then it would split it up into chunks and send
[11:16] pieter_hintjens if it was video, for example
[11:17] pieter_hintjens you'd want to send one frame per message to get best latency
[11:17] Guthur I haven't thoroughly thought it through yet, do you recommend framing?
[11:17] pieter_hintjens if it was file transfer, for example, you'd use large messages to maximize throughput
[11:17] Guthur early idea genesis stage at the moment, hehe
[11:18] pieter_hintjens then just choose a random value and optimize later when you actually know what the target is :-)
[11:18] pieter_hintjens 4096 bytes per message, there, I've decided for you
[11:18] Guthur oh, ok
[11:18] Guthur hehe, thanks
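
A rough sketch of the stream interface discussed above: read a stream and cut it into fixed-size chunks, each of which would go out as one message. The function name `chunk_stream` is hypothetical, and the 4096-byte default is simply the starting value picked in the conversation, to be tuned once the real target (latency or throughput) is known.

```python
def chunk_stream(stream, chunk_size=4096):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means end of stream.
            break
        yield chunk
```

Each yielded chunk would then be handed to whatever send call the binding provides. The last chunk may be shorter than `chunk_size`.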
[11:19] mikko Guthur: do you have tests you want to have run at some point?
[11:19] Guthur mikko: not yet, I hope to sometime
[11:19] Guthur a lot on my plate at the moment
[11:20] Guthur really behind on my MSc thesis, unfortunately
[11:20] pieter_hintjens "life is a buffet of interesting problems"
[11:24] mikko "nothing is more expensive than hiring an amateur"
[11:24] mikko saw that on twitter today
[11:24] pieter_hintjens "If you think a professional is expensive, wait till you see the cost of hiring an amateur" was the version I saw
[11:24] pieter_hintjens I think this would make a good motto for iMatix
[11:25] pieter_hintjens I prefer your short version...
[11:26] mikko
[11:26] mikko there it is
[11:26] pieter_hintjens I need a Latin version
[11:27] Guthur 'If you think training a Graduate is expensive, try an Old Dog'
[11:28] Guthur there's mine, hehe
[11:29] mikko "If you think training a graduate is expensive, you are probably right but it might pay off later. Maybe"
[11:31] pieter_hintjens I'd like to change the text "Explore the Community" on the main welcome page
[11:31] pieter_hintjens it is too bland
[11:31] pieter_hintjens It should be verb article noun
[11:33] pieter_hintjens Escape the Box
[11:34] pieter_hintjens that'll do...
[11:34] mikko reminds too much of thinking out of the box
[11:34] pieter_hintjens that was the intention
[11:35] mikko oh..
[12:17] pieter_hintjens sustrik: how can I tell if a specific commit got into the 2.0.10 release?
[12:17] pieter_hintjens e2167cecaefec6557c7a5712fb75e51487ff69a6 is the one I'm interested in
[12:19] pieter_hintjens ok, it's not there... np
[12:46] pieter_hintjens sustrik: I've found another bug in master
[12:46] pieter_hintjens am porting all the Guide examples to 2.1, some of them do quite strange stuff
[12:47] pieter_hintjens Have logged this (and #168 was the one I found earlier)
[12:50] pieter_hintjens lunch, brb
[12:55] sustrik pieter_hintjens: ok
[12:55] sustrik btw "fathom the basics" is not good
[12:55] sustrik it's a rarely used word
[12:55] sustrik and most non-native speakers won't understand it
[12:56] sustrik same with "grab"
[12:57] sustrik also, nobody will understand "escape the box" leads to the community page
[14:32] pieter_hintjens sustrik: sure, but what does "the community" mean either?
[14:32] sustrik dunno :)
[14:32] pieter_hintjens exactly, it only makes sense when you already know what it is...
[14:33] sustrik development?
[14:33] pieter_hintjens ... something that says, "Get a lot more stuff here..."
[14:33] sustrik yes
[14:33] pieter_hintjens Get the Addons
[14:33] pieter_hintjens ugh
[14:33] mikko addon sounds proprietary
[14:33] sustrik or simple "more"
[14:34] pieter_hintjens from a design basis it's Verb article Noun
[14:34] pieter_hintjens minimalism is quite difficult... :)
[14:34] sustrik true
[14:35] sustrik proceed to more
[14:35] sustrik check more stuff
[14:35] pieter_hintjens Enter the Deep End
[14:35] pieter_hintjens Dive into the Boiling Tarpits of Git
[14:36] sustrik something like that
[14:36] pieter_hintjens anyhow it doesn't really matter if people understand what each option means...
[14:36] pieter_hintjens there are 5 options, clickety-click...
[14:36] pieter_hintjens all we want is their curiosity
[14:37] sustrik we can try to leave it as is
[14:37] pieter_hintjens no-one's going to hesitate thinking, "Oh, I don't fully understand that, therefore I can't click"
[14:37] sustrik what i find really wrong is "fathom"
[14:37] pieter_hintjens sure
[14:37] pieter_hintjens I wanted a little more spice
[14:37] pieter_hintjens Learn the Basics
[14:37] pieter_hintjens but Learn sounds like school
[14:38] pieter_hintjens and "Read the Manual" just sounds like work
[14:38] sustrik read the manual is neutral imo
[14:38] pieter_hintjens hmm, basics vs. advancedics
[14:38] sustrik what about community -> "dive into details"
[14:39] pieter_hintjens I like it
[14:39] pieter_hintjens that contrasts with Basics nicely
[14:40] pieter_hintjens aight, we have a winner, thanks Martin :-)
[14:44] jobytaffey I'm working on a server which multiplexes many TCP sockets into 0MQ. I want a worker app to be able to talk bi-directionally to any TCP socket. What's an efficient way to approach this? One 0MQ socket per TCP socket? or perhaps a single 0MQ socket per direction, aggregating all TCP messages (filtered with ZMQ_SUBSCRIBE)?
[14:45] pieter_hintjens jobytaffey: have you read the Guide?
[14:46] jobytaffey Bits of. I guess I'll go and read some more then...
[14:46] pieter_hintjens this is not actually covered but it will help you understand how to make the 0MQ part
[14:47] pieter_hintjens the answer depends a lot on what kind of work you are doing with the TCP sockets
[14:47] pieter_hintjens i.e. how you map the TCP side to one or more 0MQ patterns (pubsub, req-rep, pipeline)
[14:49] jobytaffey I've got fixed sized packets going in both directions over TCP. I have an online wireless sensor network where I want to be able to register for telemetry feeds from the devices. But, given millions of devices, I don't think I can afford a 0MQ socket per device.
[14:50] pieter_hintjens jobytaffey: please read the Guide, at least Ch1 in detail, and come back when you can map your problem to 0MQ patterns
[14:50] pieter_hintjens otherwise you lack the tools to design this properly
[14:51] pieter_hintjens it sounds like pubsub except you say it's bidirectional
[14:51] jobytaffey I agree I'm trying to run before walking, thanks.
[15:00] fbarriga I don't speak english very good but after a while I could decipher the expression "Dive into the Boiling Tarpits of Git" but "fathom" is quite weird
[15:01] fbarriga reading the guide is quite amusing but all the decoration makes it a bit longer to finish.
[15:02] neale1 fathom == a unit of depth (6 feet ~ 2m). The expression is used to indicate that you are getting "deeper" into the subject matter.
[15:03] pieter_hintjens fbarriga: ah, but you did finish it ... :-)
[15:03] pieter_hintjens we had some arguments at the start, about whether or not to decorate the text
[15:03] pieter_hintjens some people complained about it, but most people enjoyed it, so that's the style I used
[15:04] fbarriga pieter_hintjens, nop, I was working on other stuff :(
[15:04] pieter_hintjens for those who prefer their text dry we have the man pages, I guess
[15:06] fbarriga the way you write the guide is more 'artistic'. I think that it sounds more intellectual than technical.
[15:13] pieter_hintjens fbarriga: well, to be honest, this is how I write when no-one is telling me to conform
[15:13] pieter_hintjens s/telling/paying/
[15:24] fbarriga that's good
[15:49] cremes is this callstack from 0mq sending or receiving a message?
[15:50] neale1 anyone had any experience using 0MQ with Spring?
[15:54] pieter_hintjens cremes: it looks like it's in the zmq_msg_init_size method
[15:55] pieter_hintjens neale1: not as far as I know of, there are some people talking about it but nothing concrete yet
[15:56] neale1 Tks. I have a client trying to use it and having problems. (That's as about as detailed as they have described it unfortunately)
[15:57] cremes pieter_hintjens: right; but i'm wondering if this is 0mq in the act of sending or receiving a message because this isn't code i am calling directly
[15:57] cremes and i'm showing that i have a small memory leak here on occasion
[15:58] pieter_hintjens cremes: hmm, sustrik would know for sure but it looks like an input event, so receiving a message
[15:58] pieter_hintjens neale1: you obviously need more detail on what the problems are...
[15:59] cremes yeah, i'm thinking this is 0mq allocating its own msg_t for receiving the message envelope/header/routing information for an req/rep pair
[16:03] sustrik cremes: receiving
[16:04] cremes sustrik: any chance the lib is not calling close on the message envelope (which isn't passed to the user)?
[16:04] sustrik cremes: how do you know there's a memory leak?
[16:04] sustrik sure, there can be a bug
[16:04] sustrik can you be more specific about the leak?
[16:04] cremes on osx there is a program called 'leaks' that can read the heap and use a malloc flag to find leaks
[16:04] cremes yes
[16:05] sustrik ok, so it maps malloc to frees
[16:05] sustrik and once the program exits
[16:05] sustrik it'll print out the leaks
[16:05] cremes 'leaks' is outputting that i am leaking 112 bytes periodically
[16:05] sustrik right?
[16:05] cremes nope, it analyses a live heap
[16:06] sustrik how can it possibly know the chunk is not referenced anymore?
[16:07] pieter_hintjens it's osx... apple... magic... :)
[16:08] cremes sustrik: login to my box with your account and run 'man leaks'; read the first paragraph
[16:08] sustrik let me see...
[16:08] cremes also, reload this gist to see the output from leaks for my program:
[16:10] cremes the receiving socket in this case is an xreq
[16:11] sustrik ok, it scans whole memory for the pointers
[16:12] cremes yes; if you keep reading, you'll see the note about MallocStackLogging for getting the stack where the leaked memory was allocated
[16:12] cremes s/stack/callstack/
[16:13] sustrik the stack i see in the output it
[16:13] sustrik is
[16:13] sustrik thread_start | _pthread_start | thread_routine | zmq::kqueue_t::loop() | zmq::zmq_engine_t::in_event()
[16:13] sustrik that's different from the one you've posted before
[16:13] sustrik is it truncated or what?
[16:14] cremes you need to scroll the gist to the right; for some reason it doesn't wrap this output
[16:14] sustrik ah
[16:16] sustrik well, there are 3 possibilities:
[16:16] sustrik 1. bug in 0mq
[16:16] sustrik 2. client program not closing the message
[16:16] sustrik 3. the pointer is actually stored outside of process memory
[16:17] cremes i don't think it's 2 because i loop over all messages and call zmq_msg_close() on each part when i'm done
[16:17] cremes there's no way for my logic to skip that step
[16:18] cremes though it's certainly possible my code is the cause
[16:18] sustrik ok
[16:18] sustrik then it's either 1 or 3
[16:18] sustrik 3 happens when pointer to memory is actually held inside of socketpair buffer
[16:18] sustrik which resides in kernel space
[16:19] sustrik at least i would assume it does, even on osx
[16:19] cremes i am *only* seeing this on the one client program i have using an xreq socket
[16:20] cremes everything else is using req,rep,pub,sub and i don't see the leak
[16:20] cremes so i am wondering if this is specific to xreq
[16:20] sustrik ack
[16:20] cremes you can see from the 'context' hex that it contains my custom identity so i assume this is routing info
[16:21] cremes which xreq is supposed to cut off before passing the messages to me
[16:21] sustrik is it a head version?
[16:21] sustrik or 2.1.0 or 2.0.x?
[16:22] cremes hang on, let me get the git commit hash (oh, 2.1.0)
[16:22] sustrik ok
[16:22] cremes e94790006ea6f4c64c commit from feb 9
[16:22] cremes i'll update to latest & greatest
[16:25] neale1 pieter_hintjens: Yeah I was just going to see if anyone had done it so (a) I know it could be done and (b) they might have a "recipe" for doing it and I could pass that wisdom on
[16:25] sustrik cremes: the application in question; is it using req or xreq socket?
[16:25] cremes xreq
[16:26] sustrik ok, so you handle individual message parts by hand
[16:26] sustrik are you closing all of them?
[16:26] sustrik identity/delimiter/body?
[16:27] cremes yes
[16:27] cremes btw, i see this with the latest master too
[16:28] cremes all message parts are passed to my read callback
[16:28] cremes and i call: messages.each { |message| message.close }
[16:29] cremes it's iterating over the array of message parts calling close on each (zmq_msg_close is called inside of the close() instance method)
[16:29] sustrik what language is that?
[16:29] cremes does xreq pass the identity and delimiter messages up the stack?
[16:29] cremes ruby
[16:29] sustrik cremes: yes
[16:29] cremes i thought only xrep saw all that detail
[16:30] sustrik same with xreq
[16:31] sustrik are there connections being created and torn down during the test?
[16:33] cremes yes, these connections can and do go away (socket is closed via zmq_close())
[16:33] cremes btw, i am printing everything this callback is receiving to my log
[16:34] cremes it does get the nul delimiter message but it does not get the identity message before it
[16:34] sustrik right the identity is used and stripped off by the xrep socket on the other side of the connection
[16:34] cremes that confirms what i originally thought; xreq strips that off
[16:34] sustrik xrep
[16:35] sustrik so the identity is passed to the xreq only on a single occasion:
[16:35] sustrik on connection initiation
[16:35] sustrik let me check the corresponding code...
[16:36] cremes i thought it worked opposite; xrep gets identity from xreq during connection
[16:36] cremes xreq doesn't actually send its identity after that; xrep prepends it locally
[16:37] cremes so when sending from xreq, it goes nul delimiter + message parts
[16:37] cremes xrep recvs routing info (if any intervening hops) and pushes its peer's identity on the top of that stack + delimiter + message parts
[16:38] cremes then when it replies, it just sends that routing info stack + nul + parts
[16:38] cremes and ultimately xreq (at original source) recvs just nul delimiter + parts because each intervening xreq stripped off one level of routing info
[16:38] sustrik except for the first part of the route stack
[16:39] sustrik which it has already used to route the message back
[16:39] sustrik to the xreq
[16:40] cremes why would the last hop need to send that routing info to the originator? that socket already "knows" their own identity!
[16:40] sustrik that's why it strips it off
[16:40] sustrik in short
[16:40] sustrik xreq doesn't mess with routing info at all
[16:41] sustrik xrep adds one peer's identity to the stack on message receipt
[16:41] sustrik and strips one identity from the stack on send
[16:41] pieter_hintjens s/xreq/dealer/g, s/xrep/router/g, it'll be much easier
[16:41] sustrik i think i see the leak
[16:41] pieter_hintjens dealers are just like push + pull
[16:42] cremes right, then i think we agree; the *last* xrep to reply to the original xreq will *strip* off its identity, so it only sends the delimiter + parts
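
The envelope rules agreed above can be modelled in a few lines. This is a toy simulation in plain Python, not 0MQ code: the names `xrep_recv` and `xrep_send` are hypothetical, messages are lists of byte-string frames, and an empty frame stands for the delimiter. It assumes, as stated above, that xreq passes frames through untouched.

```python
def xrep_recv(peer_identity, frames):
    """XREP on receive: push the sending peer's identity onto the route stack."""
    return [peer_identity] + frames


def xrep_send(frames):
    """XREP on send: pop the first identity and use it to choose the peer.

    Returns (destination_identity, frames_that_go_on_the_wire).
    """
    return frames[0], frames[1:]


# Request path: client C -> device D (xrep front, xreq back) -> server (xrep).
request = [b"", b"request"]               # what the client's xreq sends
at_device = xrep_recv(b"C", request)      # device front xrep adds the client's identity
at_server = xrep_recv(b"D", at_device)    # server xrep adds the device's identity

# Reply path: the server reuses the route stack it received.
reply = at_server[:-1] + [b"reply"]
hop1_dest, to_device = xrep_send(reply)       # server xrep strips D, sends to D
hop2_dest, to_client = xrep_send(to_device)   # device xrep strips C, sends to C
```

After both hops `to_client` holds only the delimiter and the body, which is what the original xreq receives.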
[16:44] sustrik cremes: i've pasted the patch via irc to you directly
[16:45] sustrik let me know whether it helps
[16:45] pieter_hintjens sustrik: thanks for your help with those two issues btw
[16:45] sustrik np
[16:45] pieter_hintjens i have a third one which I *think* actually might be a 0MQ issue ... :-)
[16:45] pieter_hintjens #169
[16:46] sustrik :)
[16:46] pieter_hintjens not sure what the semantics are for zmq_term and pubsub
[16:46] pieter_hintjens if there are 10 connected subscribers, should they all get the last message?
[16:47] pieter_hintjens assuming publisher sends and then terminates
[16:47] sustrik aren't you just running into async connect issue?
[16:47] cremes sustrik: success!
[16:47] pieter_hintjens it's a synchronized pubsub example
[16:48] sustrik cremes: great
[16:48] sustrik let me apply the patch
[16:48] pieter_hintjens subscribers explicitly tell publisher when they are present
[16:48] cremes sustrik: if you want to do an occasional 'leaks' check on osx, feel free
[16:48] sustrik ack
[16:49] cremes it's really easy... # MallocStackLogging=1 ./my_program
[16:49] cremes leaks <pid of my_program>
[16:49] pieter_hintjens sustrik: I'll test if it's due to async connects... hang on...
[16:50] CIA-21 zeromq2: Martin Sustrik master * r0eea935 / src/zmq_init.cpp :
[16:50] CIA-21 zeromq2: Fix for memory leak caused by long identities
[16:50] CIA-21 zeromq2: Signed-off-by: Martin Sustrik <> -
[16:50] sustrik cremes: done
[16:50] sustrik and thanks for the offer
[16:51] sustrik cremes: btw, it was you who said that SO_SNDBUF/SO_RCVBUF on OSX is measured in kB rather than bytes, right?
[16:51] cremes sustrik: yeah, that's the way it looks to me but i can't find that documented anywhere
[16:51] cremes it's screwy
[16:52] sustrik the interesting part is that you've mentioned that getsockopt(SNDBUF) returns 0
[16:52] sustrik so it's not obvious how to even find out
[16:53] cremes ack
[16:53] cremes if you want to write a small c program that exercises that stuff, that's probably the best way to know "for sure"
[16:53] sustrik what kind of kernel is that btw
[16:53] sustrik proprietary?
[16:53] cremes i could also ask about it on apple's dev lists
[16:53] cremes nope, it's open source
[16:54] cremes it's called darwin... it's a mach + freebsd hybrid
[16:54] sustrik then try asking on the list
[16:54] cremes ok
[16:54] sustrik the functionality is obviously misbehaving
[16:54] sustrik so it would be nice to know what the devs have to say about it
[16:54] cremes any chance you could provide a small c program that illustrates the issue?
[16:54] cremes code always talks louder
[16:54] sustrik i can try
[16:54] sustrik wait a sec
[16:54] cremes especially when they can repro it :)
[16:55] cremes maybe get/setsockopt only screw up on socketpairs
[16:55] sustrik yes, that's my thinking as well
[16:56] sustrik btw, POSIX doesn't specify the unit :_
[16:56] sustrik :)
[16:56] sustrik it just says "buffer size"
[16:56] cremes heh
[16:56] cremes yeah, i can't imagine get/setsockopt are broken for all sockets... that would be *very* obvious
[16:56] sustrik Same with Stevens' book
[16:57] sustrik anyway, let me write an example
[16:57] cremes k
[17:02] pieter_hintjens sustrik: you were right...!
[17:02] pieter_hintjens 0MQ is just too fast, all the 1M messages get broadcast even before the clients connect...
[17:03] sustrik heh
[17:03] pieter_hintjens it makes it quite a challenge to synchronize subscribers and publishers then...
[17:03] pieter_hintjens not sure the worked example is even valid
[17:06] sustrik i am not sure that it's even possible
[17:06] sustrik it's like synchronising a radio show
[17:06] sustrik with all listeners
[17:07] sustrik there may still be a letter from a listener in amazonia asking to postpone the show
[17:07] sustrik stuck somewhere in post office at manaos
[17:08] sustrik if you know the number of listeners in advance, it can be solvable
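
The "known number of listeners" case can be sketched with plain threads and queues. This is a toy model and not ZeroMQ code: `run_synced_broadcast` and all names in it are hypothetical, a dictionary of queues stands in for the publisher's connections, and each subscriber reports itself only after its queue is registered. The publisher blocks until it has counted N reports, then broadcasts.

```python
import queue
import threading


def run_synced_broadcast(n_subscribers, messages):
    """Broadcast messages only after n_subscribers have reported in."""
    lock = threading.Lock()
    inboxes = {}            # index -> queue; stands in for live connections
    received = {}           # index -> list of messages that subscriber got
    ready = queue.Queue()   # subscribers announce themselves here

    def subscriber(index):
        inbox = queue.Queue()
        with lock:
            inboxes[index] = inbox   # the "connection" exists from here on
        ready.put(index)             # report presence only after that
        got = []
        while True:
            item = inbox.get()
            if item is None:         # end-of-stream marker
                break
            got.append(item)
        with lock:
            received[index] = got

    threads = [threading.Thread(target=subscriber, args=(i,))
               for i in range(n_subscribers)]
    for t in threads:
        t.start()

    # Publisher side: wait for exactly N reports before sending anything.
    for _ in range(n_subscribers):
        ready.get()

    with lock:
        targets = list(inboxes.values())
    for message in messages:
        for inbox in targets:
            inbox.put(message)
    for inbox in targets:
        inbox.put(None)

    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(run_synced_broadcast(3, ["a", "b", "c"]))
```

The model only works because the report is sent after the delivery path is in place. A report sent over a separate channel proves nothing about the data channel.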
[17:13] sustrik cremes: i've created a socket pair on OSX
[17:13] cremes ok
[17:13] sustrik tried to getsockopt the SNDBUF and RCVBUF
[17:13] sustrik both result in 3,000,000
[17:13] sustrik where have you seen it returning a zero?
[17:14] cremes in mailbox.cpp ::send, that buffer expansion code will return 0 or 32 depending
[17:14] cremes zmq::mailbox_t::send around line 160
[17:15] sustrik strange
[17:15] cremes line 170 usually fails to return a sane number
[17:15] sustrik maybe setsockopt uses different units than getsockopt?
[17:15] sustrik let me try
[17:16] cremes btw, that 3 million number is bytes and i set it in my /etc/sysctl.conf if you want to look at that
[17:16] sustrik that seems to be ok
[17:16] cremes that's the sysctl for local communications which is what i guess socketpair uses
[17:16] sustrik no need checking
[17:16] sustrik the problem is somewhere further down the way
[17:21] sustrik resizing works better on osx than on linux :)
[17:21] sustrik i resize to 100,000
[17:21] sustrik i check the size
[17:21] sustrik i get 100000
[17:21] sustrik when i do same with linux i get
[17:21] sustrik 200000
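
The resize-and-read-back experiment can be approximated with Python's standard `socket` module. This is a minimal sketch, not the C program from the conversation, and the helper name `resize_and_report` is hypothetical. The value read back depends on the kernel: Linux documents that it doubles the requested size to leave room for bookkeeping overhead and caps it at a system limit, which would be consistent with the 200000 reported above.

```python
import socket


def resize_and_report(requested):
    """Set SO_SNDBUF on one end of a socket pair and read the value back.

    Returns (size_before, size_after) as reported by getsockopt.
    """
    a, b = socket.socketpair()
    try:
        before = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        after = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        return before, after
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    before, after = resize_and_report(100_000)
    print(f"default={before} requested=100000 reported={after}")
```

Running the same script on both systems is a quick way to compare how each kernel reports the size. No expected output is given because the numbers vary with the platform and its sysctl settings.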
[17:22] cremes heh
[17:22] cremes try running that stress test... that usually blows it up
[17:27] pieter_hintjens sustrik: something to think about for the future, if we can make synchronous connects
[17:27] pieter_hintjens it'd add real value IMO
[17:27] pieter_hintjens for certain use cases at least
[17:28] cremes pieter_hintjens: that could be provided by an add-on/wrapper library
[17:28] pieter_hintjens nope, not afaics
[17:28] pieter_hintjens i'm hitting this problem in one of the examples
[17:28] cremes why, because connection status isn't exposed?
[17:28] pieter_hintjens i use a req/rep dialog to synchronize the two peers, then a pubsub dialog for the data
[17:29] cremes sounds like ftp
[17:29] pieter_hintjens but I can't get the two synchronized
[17:29] pieter_hintjens because the pubsub connect can take any arbitrary time
[17:29] cremes hmmm
[17:29] pieter_hintjens so even if the req/rep dialog says 'ready', that doesn't mean the subscriber will get data
[17:29] cremes right
[17:29] pieter_hintjens the only sure way is that the pubsub dialog explicitly confirms the connection, if over a connected transport
[17:30] cremes so, you bind the SUB socket, the req/rep says ready, you connect the PUB and it fails?
[17:30] pieter_hintjens yup
[17:30] cremes sounds like a bug
[17:30] pieter_hintjens doesn't fail, just doesn't connect in time to get any data
[17:30] cremes wait, which one doesn't connect in time to get data?
[17:30] pieter_hintjens It's this one:
[17:31] pieter_hintjens i'd like a handshake between publisher & subscriber at connect time, which is exposed to the app
[17:32] pieter_hintjens so subscriber sends identity, and publisher acknowledges, and app code can wait for that to complete
[17:32] pieter_hintjens optionally
[17:32] cremes pieter_hintjens: this is why i use devices so much
[17:32] cremes put a forwarder in the middle and it will probably work
[17:32] cremes the pub connects to the forwarder as do the subs
[17:32] pieter_hintjens i'll have the same issue from forwarder to subscribers
[17:32] cremes when your req/rep gives the all-clear, let 'er rip
[17:33] cremes well, if zmq_connect() takes an arbitrary amount of time, that seems like a 'broken' contract to me
[17:33] pieter_hintjens it's all asynchronous, adding more steps may introduce enough delay, but it's not certain
[17:33] cremes right, i see
[17:34] pieter_hintjens sustrik: would you consider a handshake for new connections?
[17:34] pieter_hintjens i.e. C: identity frame, S: identity ack
[19:16] sustrik cremes: still there?
[19:16] cremes sustrik: yes
[19:16] sustrik git doesn't seem to be installed on your box
[19:16] sustrik how do you get the sources?
[19:16] cremes add /opt/local/bin to your $PATH
[19:17] sustrik ok
[19:17] sustrik works!
[19:17] sustrik thanks
[19:17] cremes you're welcome
[19:19] sustrik btw, have you seen the shutdown stress test fail with head?
[19:20] sustrik doesn't fail for me
[19:26] cremes sustrik: i haven't seen it fail since i boosted my buffers to 3MB (that 3 million number you saw earlier)
[19:26] cremes if i move back to the defaults, i'm fairly certain it will fail
[19:26] sustrik aha
[19:26] sustrik can you do that?
[19:26] cremes not at the moment... i'm wrapping up some work
[19:26] cremes i can do that tomorrow
[19:27] sustrik ok, np
[19:27] sustrik just ping me then
[19:28] ljackson did I read the documentation correctly that in a multithreaded app the zmq context is common to all sockets bind and connect ?
[19:28] sustrik yes
[19:29] ljackson k, if you have an embedded queue device with workers in threads connecting to inproc
[19:29] pieter_hintjens ljackson: only necessarily if you're using inproc
[19:30] ljackson can you with the same context in another thread add stuff to the queue via another inproc ?
[19:30] ljackson pieter_hintjens, good to know.
[19:30] pieter_hintjens ljackson: the semantic for communicating between threads is 'send a message'
[19:30] pieter_hintjens you can send a message to the queue device from any thread, obviously
[19:30] ljackson what I am seeing is zmq hanging trying to connect/bind must be something i messed up then
[19:31] ljackson basically, as a test, trying to take mtserver.cpp and put the client internally, all on inproc using the same context. is this even valid ?
[19:31] ljackson obviously
[19:31] ljackson different inproc:// addresses
[19:31] ljackson workers for threads as example has and then using queue for clients
[19:31] pieter_hintjens it should work
[19:32] ljackson humm
[19:32] ljackson ok will keep digging or write example test code if I can get it to work and ask again here
[19:32] ljackson thx
[19:32] pieter_hintjens if your code's short, post it to a gist so we can look at it
[19:33] ljackson yeah might have to write an example code for my own sanity anyway if I can reproduce my issue I will post and ask again here
[19:34] ljackson i also believe I read that the order of connection bind vs connect doesn't matter? Or is that in only certain socket types ?
[19:35] ljackson worried that the device thread is not binding before the workers connect
[19:35] ljackson ...etc.
[19:35] pieter_hintjens for inproc it matters
[19:35] pieter_hintjens you must absolutely bind, then connect
[19:35] pieter_hintjens let me send you a new version of mtrelay that shows how I do this in 2.1
[19:35] pieter_hintjens it's somewhat changed
[19:36] ljackson nice thx, I am using 2.1.1 from git
[19:36] pieter_hintjens
[19:36] pieter_hintjens the trick is to bind and connect a socket pair in the parent thread, then pass the context & socket to the child thread
[19:37] pieter_hintjens i will try to make a simple abstraction for this, it's a very common pattern
[19:37] pieter_hintjens specifically for inproc multithreading that is never intended to be scaled out
[19:39] ljackson humm
[19:39] ljackson i thought you were never to share/send the socket ?
[19:42] pieter_hintjens in 2.1 this is legal, and extremely useful for inproc/multithreading
[19:42] pieter_hintjens not shared, just sent
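
The hand-off pattern (the parent sets up both ends, then gives one end to the child and stops using it) can be shown with plain sockets. This sketch deliberately swaps in the standard library's `socket.socketpair()` for a ZeroMQ inproc pair, so it shows only the ownership idea, not the ZeroMQ API. The names `child` and `run_relay` are hypothetical.

```python
import socket
import threading


def child(sock, results):
    # From here on, the child is the only thread that touches this end.
    with sock:
        results.append(sock.recv(1024))
        sock.sendall(b"done")


def run_relay():
    # The parent creates both ends, so the pair is fully connected
    # before the child thread even starts.
    parent_end, child_end = socket.socketpair()
    results = []
    worker = threading.Thread(target=child, args=(child_end, results))
    worker.start()  # child_end is handed over; the parent never uses it again
    with parent_end:
        parent_end.sendall(b"ready")
        reply = parent_end.recv(1024)
    worker.join()
    return results[0], reply


if __name__ == "__main__":
    print(run_relay())
```

Because setup happens in one thread, there is no race over which side comes up first. With inproc that corresponds to doing the bind and the connect in the parent, in that order.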
[19:49] ljackson ahh i get it so you know for sure that it was done in the correct order
[19:50] ljackson then you send on the pointer and forget you knew about it
[19:50] ljackson k
[19:51] ljackson pieter_hintjens, so you need to bind a new socket for each worker thread then
[19:51] ljackson in REQ/REP...etc.
[19:52] pieter_hintjens yes
[19:55] amacleod What 0MQ pattern is best for an asynchronous dialogue, where 2 participants can send messages but there doesn't need to be a 1:1 request/response correspondence?
[19:55] amacleod Can XREQ/XREP do that?
[19:56] ljackson pieter_hintjens, thread_args_t *child; child->socket = new zmq::socket_t(*context, ZMQ_XREP); ?
[19:57] cremes amacleod: yes, those sockets are the perfect choice
[19:57] pieter_hintjens ljackson: what's the question?
[19:57] amacleod Okay, cool. My next step is to understand how to use identity addressing, then.
[19:57] pieter_hintjens amacleod: have you read the Guide yet?
[19:57] ljackson c++ vs your example
[19:57] amacleod pieter_hintjens, parts of it.
[19:58] pieter_hintjens ljackson: you're asking whether the C++ is correct?
[19:58] ljackson as in the new pointer .. yeah nevermind answered my own Q
[20:05] amacleod So, I noticed when using the Java bindings, I cannot seem to interrupt a recv operation that's trying to read from a socket whose other endpoint doesn't exist.
[20:05] amacleod If I want to preserve Java interruptability, should I be using NOBLOCK and a poller?
[20:16] fbarriga I have a little problem with python
[20:16] fbarriga I can't receive a structure from C++
[20:16] amacleod fbarriga, what format are you using to serialize structures?
[20:17] fbarriga I don't know why It tries to receive it like a string
[20:17] fbarriga in C++ raw structure
[20:17] fbarriga in python struct.unpack
[20:17] fbarriga in the reverse way it works
[20:17] fbarriga msg = self.socket.recv()
[20:18] fbarriga len (msg) it prints 29
[20:18] fbarriga and I'm sending only a double
[20:19] fbarriga double nav = 1; memcpy(zmq_msg_data(&msg), &nav, sizeof(double));
[20:19] amacleod Can you do a byte-by-byte comparison of what you're getting on the wire and the result of struct.pack with the value you expect?
[20:21] amacleod On the C++ side you are telling zmq that your packet should be (sizeof double) bytes long?
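
A quick size check on the Python side makes this kind of mismatch visible early. The sketch below uses only the standard `struct` module. The helper names are hypothetical, and the `"=d"` format assumes sender and receiver share the same byte order, which is what a raw `memcpy` of a `double` implies.

```python
import struct

# "=d": one double, native byte order, standard size, no padding (8 bytes).
DOUBLE_FORMAT = "=d"
DOUBLE_SIZE = struct.calcsize(DOUBLE_FORMAT)


def pack_double(value):
    """Bytes equivalent to memcpy'ing a C double into the message body."""
    return struct.pack(DOUBLE_FORMAT, value)


def unpack_double(payload):
    """Decode one double, refusing payloads of the wrong length."""
    if len(payload) != DOUBLE_SIZE:
        raise ValueError(
            f"expected {DOUBLE_SIZE} bytes for a double, got {len(payload)}"
        )
    (value,) = struct.unpack(DOUBLE_FORMAT, payload)
    return value


if __name__ == "__main__":
    wire = pack_double(1.0)
    print(len(wire), unpack_double(wire))
```

A 29-byte message fails the length check immediately, which suggests the data comes from a different sender than the one expected.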
[20:22] ljackson pieterh, according to what I read in the docs/guide am i correct in that the PUSH/PULL socket can have multiple pushers and a single puller in the same context e.g. workers and sinks ?
[20:23] pieterh sure
[20:23] fbarriga sorry guys I found it
[20:23] ljackson pieterh, just wanted to make sure I read it right thx :)
[20:23] pieterh ljackson: all socket types except PAIR can be connected 1-N or N-1
[20:23] ljackson or N-N ?
[20:23] fbarriga quite stupid my error, I've 2 streams and I was connecting to the wrong one :(
[20:24] pieterh that's just 1-N and N-1 from two sides
[20:24] ljackson ya
[20:25] ljackson inproc, pull push need the same treatment as req/rep ..etc ?
[20:28] pieterh ljackson: rtfm here
[20:28] pieterh i'm adding more detail on the 2.1 mtrelay example but the point is that inproc is not a disconnected transport
[20:34] ljackson k thx
[20:52] amacleod In Java bindings, how do I determine what error happened when recv returns null?
[20:58] CIA-21 jzmq: Gonzalo Diethelm master * r91da678 / (5 files in 2 dirs):
[20:58] CIA-21 jzmq: Use zmq_errno() everywhere instead of errno.
[20:58] CIA-21 jzmq: Set all projects to compile with a Release configuration. -
[21:05] gdan trying to build in ubuntu 10.10: installed: g++,g++ 4.5,gcc-opt, libstdc++6, libstdc++645-dev, *** I am getting what appear to me to be stl errors, does anyone have a list of required libraries?
[21:06] gdan building the libzmq
[21:09] pieterh gdan: I'm searching for the answer...
[21:09] gdan thanks
[21:10] pieterh gdan: do you have build-essential?
[21:11] gdan let me check
[21:12] gdan do not see it, nor do i see it avail in software center
[21:13] pieterh apt-get install build-essential
[21:13] pieterh sudo apt-get install build-essential, to be accurate
[21:13] pieterh also uuid-dev
[21:14] gdan installing...
[21:14] soren Or: sudo apt-get build-dep libzmq0
[21:14] soren That installs all the build-dependencies of the libzmq0 ubuntu package.
[21:14] soren Handy shortcut.
[21:15] gdan will do...
[21:19] gdan i re-ran configure, now make is running. so far, looks good. thank you
[21:24] pieterh gdan: np
[21:28] ljackson pieterh, odd appears i am close, getting context terminated exception...
[21:29] pieterh main thread terminates the context before other threads are finished
[21:31] ljackson humm
[21:32] ljackson this happens starting up the ZMQ_QUEUE device. even before I get to starting worker threads
[21:32] ljackson QUEUE bridging inproc not tested ?
[21:38] pieterh ljackson: hard to say without code to look at
[21:39] pieterh the queue device is just normal code
[21:39] pieterh find where you're doing zmq_term, print "HELLO" and then check whether that shows before the error...
[21:40] pieterh if you terminate the context, that kills all inproc sockets
[21:42] ljackson yeah not doing any terminate, but a few other things with this existing code... prob need to make test code to try to reproduce
[21:49] ljackson doh I think i see it
[21:51] amacleod Where are the URIs for 0MQ documented? I assumed I should be using tcp://host:port even for my XREQ/XREP dialogue, but the "mamas/papas" examples in the guide use things like "ipc://routing.ipc".
[21:53] amacleod Found it. zmq_ipc(7) man page.
[21:58] pieterh amacleod: zmq_bind / zmq_connect man pages list the transports, each has its own man page
[21:59] amacleod Ok, thanks. So I still want tcp since I want all my sockets to be remote.
[22:00] amacleod I'm still having trouble wrapping my head around the addressing requirements for XREQ/XREP.
[22:01] amacleod So far I have an XREP "server" that listens for connections. I'm able to create an XREQ "client" that establishes a connection and sends a request.
[22:01] amacleod The server sees the request and where it came from (2 message parts: address, payload), and is able to reply.
[22:02] amacleod When I send the reply as (address, null, payload), I can see in wireshark that some data gets sent in the opposite direction on the socket, but my recv call in the client never returns.
[22:02] pieterh amacleod: ... some generalities about the request-reply pattern
[22:02] pieterh in most cases, client is REQ and server is REP
[22:03] pieterh and anything in the middle (e.g. queue device) is XREP--XREQ
[22:03] pieterh that's the basic layout
[22:03] pieterh after that there is the weird stuff
[22:03] pieterh which chapter 3 explains
[22:03] amacleod So to do this "asynchronous dialogue" thing, do I need to have more than 2 participants?
[22:04] pieterh asynchronous means what exactly?
[22:04] pieterh in your use case, I mean
[22:04] amacleod Not lockstep on requests and replies.
[22:04] pieterh so how do the messages flow?
[22:04] pieterh do you have N clients, 1 server?
[22:04] pieterh N clients, N servers?
[22:04] amacleod Yes.. N clients, 1 server.
[22:04] amacleod Bi-directional messages.
[22:05] pieterh can servers send messages to clients who have not sent a message to the server?
[22:05] pieterh or is it 1 request / 0-N responses?
[22:05] amacleod No. There has to be at least an initial handshake.
[22:05] amacleod It's 1-N requests, 0-N responses.
[22:05] amacleod So, the client can still send messages to the server even while the server is sending responses.
[22:05] pieterh can clients pipeline requests?
[22:05] pieterh ah, ok
[22:06] pieterh so your server definitely has to be XREP (which we will rename to ROUTER at some stage)
[22:06] amacleod Okay. That's the way I have it.
[22:06] pieterh your clients always initiate the dialog, and they talk to a single server only
[22:07] amacleod When I get messages in on the server, I add the identity to a routing table in the server object.
[22:07] gdan does anyone know how, in mono (on ubuntu), i reference the libzmq runtime libraries i just built in the project? (i do see that they are in usr/local/lib)
[22:07] pieterh so they do indeed use XREQ
[22:07] amacleod pieterh, that is correct.
[22:07] pieterh if your clients talked to N servers, you'd want to use XREP for them as well
[22:07] amacleod gdan, does mono respect LD_LIBRARY_PATH?
[22:08] gdan don't know. perhaps when i run i specify the path?
[22:08] amacleod pieterh, indeed. One thing I didn't realize at first was that, with XREQ, I did not need to explicitly send the client's identity.
[22:08] amacleod gdan, I know in Java, I have to do something like -Djava.library.path=path/to/jzmq_native. Probably mono has something similar.
[22:08] pieterh amacleod: if you use either XREP or XREQ in your code, you absolutely have to study and learn how these sockets create and use envelopes
[22:09] pieterh it's logical once you know it but it is really not obvious
[22:09] amacleod pieterh, I have been trying to understand a bit by looking at wireshark.
[22:09] gdan i'll look
[22:09] pieterh it's explained in Ch3, look for Request-Reply envelopes
[22:10] amacleod One thing I was confused about in chapter 3 was how XREP and XREQ expect their envelopes. Is it true that XREQ expects envelopes the same way REQ does and that XREP expects envelopes the same way REP does?
[22:10] pieterh no
[22:10] pieterh the name XREQ is highly misleading
[22:10] pieterh it should and will be called "DEALER"
[22:10] pieterh it is exactly equivalent to a PUSH+PULL combination
[22:11] pieterh it deals messages out, and in, without changing them
[22:11] pieterh whereas REP is a terminator that rips open the envelope, hides it, gives you the contents, and sneakily recreates the envelope when you send a reply
[22:11] pieterh they are fundamentally different tools
[22:12] pieterh take a look at the rtdealer example
[22:12] amacleod Ok. What I was thinking a while ago is that I could achieve what I wanted with a pair of REQ/REP sockets, one for each direction. The replies would always be dinky little "yep I got it" acknowledgements.
[22:13] pieterh it will work really easily using XREQ/XREP, don't panic
[22:13] amacleod But I couldn't think of a good way to create both at once. (trying not to panic :-D)
[22:13] pieterh - your clients using XREQ just connect, and send simple messages
[22:14] pieterh - the server, when it receives a message, gives the app the client identity, followed by the simple message, in two parts
[22:14] amacleod So they are similar to the "worker" in rtdealer? (I say this because worker uses XREQ).
[22:14] pieterh - the server, when it wants to talk to client A, sends two parts: identity A, and then simple message
[22:14] pieterh in rtdealer, client is the worker (uses XREQ as you see), server is the main thread
[22:15] pieterh would it help to have a real example of a server and clients working like this?
[22:15] pieterh i'll make one tomorrow, it's late here now
[22:15] amacleod Possibly. I might try translating rtdealer into Java after I get home.
[22:16] pieterh it's not literally what you want, but it shows how to use that socket pair
[22:17] amacleod I thought that I had things pretty much okay as far as you have said. When you say "the server sends two parts: identity A, and then simple message", do you mean that XREP(router) does that for me when I call send, or that my code should do that using sendmore followed by send?
[22:17] pieterh sendmore followed by send
[22:18] pieterh you need to explicitly tell the router socket who to send the message to
[22:18] pieterh it does not actually send the identity
[22:18] pieterh it uses that to decide what client to talk to, then sends the remaining message part(s)
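
The three rules above can be captured in a small in-memory model. This is a toy stand-in written in plain Python, not ZeroMQ itself: the class `ToyRouter` and its methods are hypothetical, and dropping messages for unknown identities is an assumption of the model.

```python
from collections import deque


class ToyRouter:
    """In-memory model of the identity routing described above (not ZeroMQ)."""

    def __init__(self):
        self.inbox = deque()   # multipart messages waiting for the server app
        self.outboxes = {}     # identity -> messages delivered to that client

    def client_send(self, identity, *parts):
        # Inbound: the router prepends the sender's identity for the app.
        self.outboxes.setdefault(identity, deque())
        self.inbox.append([identity, *parts])

    def recv(self):
        # The server app sees [identity, part, ...].
        return self.inbox.popleft()

    def send(self, identity, *parts):
        # Outbound: the first part only selects the peer and is not delivered.
        if identity not in self.outboxes:
            return False       # unknown peer: dropped in this model
        self.outboxes[identity].append(list(parts))
        return True

    def client_recv(self, identity):
        return self.outboxes[identity].popleft()


if __name__ == "__main__":
    router = ToyRouter()
    router.client_send(b"A", b"hello")
    identity, payload = router.recv()
    router.send(identity, b"hi back")
    print(router.client_recv(b"A"))
```

In this model any extra part placed after the identity, such as an empty delimiter, reaches the client as an ordinary message part, so a client that expects a single part has to read more than one.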
[22:18] amacleod Ok. It's possible there is a bug in my client receiving code. I did see the reply come back on the socket in wireshark.
[22:19] pieterh I think your use case is much better than the one I used for XREP-XREQ
[22:20] pieterh so I'll change the Guide for that...
[22:20] amacleod It's for instant-message style chat. (wrapping XMPP).
[22:20] pieterh indeed
[22:20] pieterh tomorrow sometime I'll have a working example
[22:21] amacleod Ok. I'll stop pestering you so you can sleep :-) If I am able to translate the current example into Java, I'll let you know here tomorrow.
[22:23] pieterh np :-)
[22:41] CIA-21 zeromq2: Mikko Koppanen master * r98ccff1 / builds/redhat/zeromq.spec :
[22:41] CIA-21 zeromq2: Fixes build on at least CentOS 5
[22:41] CIA-21 zeromq2: Signed-off-by: Mikko Koppanen <> -
[23:39] rem7 what is the best way to ensure the delivery of a msg in a PUSH/PULL (something very similar to the ventilator tutorial) ... I sent out about 16Million msgs and my sink recvd 10million.
[23:41] rem7 I was reading that multicast only works on PUB/SUB...?
[23:50] mikko sustrik: thanks!