Monday December 6, 2010

[Time] Name Message
[05:46] Guest53515 i've been reading the docs and i see that you can send a message and basically wait for an ack (reply) back. is there a way to send and time out with an error if a reply has not been received within a timeframe?
[05:48] guido_g no, you have to do this yourself
[05:55] Guest53515 thanks.. can the receiver time out or will it block forever?
[05:56] guido_g there is no timeout paramter for receive, but you can simulate it via poll
[05:58] guido_g *parameter
[05:59] Guest53515 thanks, i was wondering about the poll. i will have a look
[06:00] guido_g i'm not sure about this particular thing, but the guide is normally a good source of information
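
The "simulate it via poll" advice above can be sketched roughly as follows. This is a minimal illustration that assumes the pyzmq binding is installed; the helper name recv_with_timeout is hypothetical and not part of 0MQ. It polls the socket for readability for a bounded time and only calls recv() when a message is waiting.

```python
import zmq


def recv_with_timeout(sock, timeout_ms):
    """Return the next message, or None if nothing arrives within timeout_ms.

    recv_with_timeout is a hypothetical helper, not a 0MQ call.
    """
    # Socket.poll() returns a non-zero event mask once the socket is readable,
    # and 0 if the timeout expires first.
    if sock.poll(timeout_ms, zmq.POLLIN):
        return sock.recv()
    return None
```

The caller decides what a None result means, for example treating the peer as gone after 2 seconds.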
[08:11] mikko sustrik: there?
[08:21] sustrik mikko: hi
[08:26] mikko
[08:26] mikko xpub/xsub missing from win build files?
[08:28] mikko brb, need to commute to the office
[08:28] sustrik mikko: oops
[08:28] sustrik let me fix that
[08:28] sustrik you are already building with MSVC?
[08:29] mikko yes, testing with win7 build slave
[08:29] sustrik impressive
[08:29] mikko -> linux, solaris10, win7 now
[08:29] mikko brb
[08:29] guido_g btw, are there already any docs xspub/ssub or is it code only currently?
[08:29] sustrik it's been committed yesterday
[08:29] guido_g *docs on xpub/xsub
[08:29] sustrik and does nothing useful
[08:29] guido_g ahhh ok
[08:30] sustrik it's just an infrastructure for subscription forwarding
[08:30] guido_g but sounds great! :)
[08:30] sustrik basically the idea is that subscriptions are just messages
[08:30] sustrik that are passed up the message distribution tree
[08:44] CIA-20 zeromq2: Martin Sustrik master * r8a6ff4c / builds/msvc/libzmq/libzmq.vcproj :
[08:44] CIA-20 zeromq2: xup and xsub files added to the MSVC build
[08:44] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[08:45] guido_g :)
[08:47] sustrik re
[08:48] guido_g wb
[08:50] sustrik :)
[09:13] Steve-o sustrik: so any hope of back pressure on PUB sockets?
[09:13] sustrik what's your problem?
[09:14] Steve-o client sending to PGM faster than the rate limit, there's back pressure from OpenPGM but nothing from 0MQ
[09:15] Steve-o you can set HWM and lose messages, but the API doesn't provide any feedback
[09:15] sustrik Steve-o: don't set the HWM then
[09:15] Steve-o which forces the developer to implement their own rate engine above 0MQ at a guess
[09:16] Steve-o making the granular rate engine in OpenPGM pretty useless
[09:24] sustrik Steve-o: still there? what the problem with leaving HWM infinite?
[09:30] mikko sustrik: builds now
[09:30] mikko thanks
[09:30] sustrik np
[09:30] mikko if this trial works well ill add the win7 as permanent part
[09:30] mikko to the build cluster
[09:31] sustrik that would be great
[09:31] sustrik win platform is notoriously under-tested
[09:31] mikko sadly this build box is 32bit
[09:32] mikko i need to look into the stockpile of old servers if i could find 64bit iron
[09:55] mato mikko: hi
[09:55] mato mikko: you were after me?
[10:00] mikko mato: yeah
[10:00] mikko mato: -C option to tar is not portable it seems
[10:01] mato mikko: -C ?
[10:01] mikko solaris tar accepts -C when creating or replacing archive
[10:01] mato mikko: what is using -C?
[10:01] mikko yes, -C is used when extracting openpgm
[10:02] mikko so, im wondering that should we require gnu tar or chdir before extraction
[10:02] mikko i got a patch for the chdir approach somewhere
[10:02] mato the latter, since it is supposed to work on Solaris I guess
[10:02] mato anything to do with openpgm doesn't have to be portable past MinGW/Linux/Solaris
[10:03] mikko ok, so i'll post the patch as it is when i got time
[10:03] mikko mato: there is experimental win7 in the daily builds now as well
[10:03] mato sure, go for it
[10:03] mikko running visual studio 2008
[10:03] mato mikko: wow, how did you manage that? :-)
[10:04] mato mingw I could grok, but scripting VS builds ...
[10:04] mikko i installed hudson as windows service and it uses msbuild command line tools to run builds
[10:04] mato ah, so it has support for that
[10:04] mato great!
[10:04] mikko the only pain was to install windows using vmware esx console
[10:04] mikko the mouse moves in jumps
[10:05] mato by the way, the guy with rhat gcc3 problems solved his own problem
[10:05] mato so I guess we can leave that alone for now
[10:05] mikko i got a patch for it though
[10:05] mato let's see if anyone else complains that they really need to use broken RHAT gcc3 :)
[10:06] mato well, if you're happy with the patch, send it to the ML.
[11:09] adalrsjr1 hello...
[11:09] adalrsjr1 i need to run a zmq java application on machines without zmq
[11:09] adalrsjr1 how can i do it?
[11:10] adalrsjr1 without zmq installed
[11:10] adalrsjr1 i'm using linux
[11:11] mikko adalrsjr1: currently the java binding requires libzmq
[11:11] mikko adalrsjr1: i don't think there is a pure java implementation
[11:16] adalrsjr1 that's my problem, i don't have libzmq on these machines
[11:17] adalrsjr1 but i have libzmq compiled on another pc
[11:17] mikko as far as i know currently you require libzmq with the java binding as it uses jni
[11:26] Steve-o adalrsjr1: if the Linux versions are the same you should be able to copy over the jni & core zmq dynamic libraries
[11:28] Steve-o well, noted for the IRC log :-)
[11:49] sustrik mikko, mato: a suggestion -- shouldn't we limit test_shutdown_stress to something reasonable
[11:49] sustrik like setting up and tearing down 100 connections
[11:49] sustrik so that it would work on any box without running into resource problems?
[11:51] mato sustrik: yes
[11:51] sustrik ok
[11:51] mato sustrik: well, maybe... it does expose problems when run properly
[11:51] sustrik that's nice
[11:52] mato sustrik: but at least on solaris there's a limit of 256 fds per process by default
[11:52] sustrik otoh, the builds fail because of it
[11:52] mato sure, but then how will we know we've fixed the problem :)
[11:52] mato maybe...
[11:52] mato how about we change the test to run less connections on !linux
[11:54] sustrik i don't think platform.hpp is included in the tests
[11:54] sustrik do you know you are not on linux there?
[11:59] mato sustrik: hang on
[12:00] mato sustrik: a) the tests can obviously include platform.hpp
[12:00] mato sustrik: b) test_shutdown_stress can call getrlimit(3) with RLIMIT_NOFILE and pick some sane value
[12:00] mato sustrik: for the # of iterations
[12:00] mato sustrik: that way we can at least ensure it won't run out of FDs
[12:01] mato sustrik: if it dies due to other problems then that's a valid bug...
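
mato's getrlimit(3) suggestion could look something like the sketch below. The real test is C++; this Python version only illustrates the idea using the standard resource module. The function name, fds_per_connection and reserve are assumptions made for the example, not values taken from the 0MQ test suite.

```python
import resource


def pick_connection_count(desired, fds_per_connection=2, reserve=32):
    """Cap the number of test connections so the process stays under RLIMIT_NOFILE.

    fds_per_connection and reserve are illustrative guesses, not measured values.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return desired
    # Keep some descriptors back for stdio, internal signalling and so on.
    budget = max(soft - reserve, 0) // fds_per_connection
    return max(1, min(desired, budget))
```

On a box with the 256-descriptor default mentioned above, a request for 1000 connections would be capped at 112 under these assumed parameters.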
[12:01] sustrik what about socket buffer size?
[12:01] mato what about it?
[12:01] mato that's a bug, we need to fix that somehow :-)
[12:02] mato we shouldn't be hiding bugs by modifying test cases
[12:04] sustrik the question is how to distinguish "out of resources" from "bug"
[12:04] mato then our mailbox_t needs to report back that it has run out of socket buffers, and return that to the app somehow
[12:06] sustrik it has to fail; at that point the system is broken beyond any hope to repair
[12:06] sustrik it's basically a same problem as ENOMEM
[12:06] sustrik is it a bug or is it not?
[12:07] mato well, the way we're dealing with it at the moment is a bug
[12:07] sustrik the assertion?
[12:07] mato that, and/or the fact that we're reliant on the socket buffer size so much
[12:08] mato step 1 would be to at least change the assertion to somehow return an errno to the application. if that's possible.
[12:08] mato step 2 is obviously to fix the signaler
[12:08] mato again, if that's possible - i know we've been through this...
[12:08] mato sustrik: think about it in kernel terms
[12:09] mato sustrik: do you panic the system if some buffer runs out? i think not.
[12:09] sustrik the problem is that in this case the buffer in question is part of the essential infrastructure
[12:09] sustrik thus when it overflows the whole thing is unusable
[12:10] sustrik you do panic in such a case
[12:10] sustrik still better than undefined behaviour
[12:11] sustrik anyway, i'll leave the stress test as is for now
[13:26] sustrik drbobbeaty: hi
[13:28] drbobbeaty sustrik: hi
[13:29] sustrik hi, have you seen steve's answer about ZMQ_RECOVERY_IVL smaller than 1 sec?
[13:30] drbobbeaty Yeah, he was talking about making a C-level call to OpenPGM based on the size of the buffer. Since I'm using the C++ interface, I didn't know how/if that would be possible given that I don't have access to the underlying C pointers/structs.
[13:31] sustrik the idea is to tweak the 0MQ source code
[13:31] sustrik ie. set the recovery in number of packets rather than in seconds
[13:32] sustrik see src/pgm_socket.cpp
[13:33] sustrik line 203 and 236
[13:34] drbobbeaty Ah! OK... I can tweak the code if needed. My question would be if this is going to be supported in some manner in the straight ZeroMQ releases. I can wait on this if it's coming out soon, or I can make the changes and then back them out when the feature becomes available from you guys.
[13:35] sustrik drbobbeaty: if you send a patch to the mailing list, i'll apply it
[13:35] sustrik just make sure that it actually works before sending it
[13:35] drbobbeaty he he he... yeah, that'd be a good thing to make sure :)
[13:35] sustrik (i don't have a test env here, so i won't be able to test it really)
[13:35] drbobbeaty I'll have a look and then read up on the ML diff submission process.
[13:36] sustrik sure
[13:36] sustrik have you seen the code?
[13:36] sustrik the change seems to be pretty trivial
[13:36] drbobbeaty I think I can handle it :) But if I have any questions, I will be back to ask for help.
[13:37] sustrik sure
[13:38] sustrik drbobbeaty: ah, damn
[13:38] sustrik change from sec to msecs would break the backward compatibility :(
[13:39] drbobbeaty Yeah, I can imagine that... but what about a different named socket option?
[13:39] sustrik yes, looks like the only option atm
[13:40] drbobbeaty Seems fair... do you have a preference for what that name should be? ZMQ_RECOVERY_MSEC or ZMQ_RECOVERY_IVL_MSEC?
[13:41] mikko is that confusing?
[13:41] mikko two constants for effectively same option
[13:41] sustrik mikko: any better idea?
[13:42] mikko it would be still possible to change this in 2.1
[13:42] mikko as there is no stable release
[13:42] mikko beta is what it says on the tin
[13:42] sustrik the backward compatibility is guaranteed within a major version
[13:42] sustrik so it can't be broken before 3.0
[13:42] mikko but things are already breaking going from 2.0 to 2.1
[13:42] mikko zmq_init for example
[13:43] sustrik how so?
[13:43] mikko or did that change earlier?
[13:43] sustrik yep, that changed somewhere at 2.0.4
[13:43] sustrik since then people have complained about breaking backwards compatibility
[13:43] sustrik so i've written compatibility guidelines
[13:43] sustrik let me find it...
[13:46] sustrik
[13:46] drbobbeaty sustrik, mikko: I am enjoying reading the "Contributing to 0MQ" page... as soon as I get through all this, and you guys decide how you'd like me to implement it, I'll get right on it.
[13:47] mikko sustrik: "It may even run, however, you should read the NEWS file so you are sure that changes made won't affect your application behaviour in subtle ways."
[13:47] sustrik enjoying the bureaucracy? :)
[13:48] mikko sustrik: it would still compile against the version
[13:48] sustrik mikko: right
[13:48] mikko is what your policy says
[13:48] mikko but people using recovery IVL would need to read about the change
[13:48] sustrik but allocating buffer 1000x larger than expected is not really a subtle change
[13:49] sustrik it's pretty dangerous actually
[13:50] sustrik i would go for new socket option now
[13:50] sustrik and normalise the two options into a single one in v3.0
[13:50] mikko i guess that is sensible for now. i think we should keep a list things like these that need to be cleaned up on major
[13:51] sustrik some comments are already here:
[13:52] sustrik drbobbeaty: just add a new option for now
[13:52] drbobbeaty sustrik: OK, that's what I'll do.
[13:53] sustrik something like:
[13:53] sustrik "RECOVERY_IVL_MSEC has precedence to RECOVERY_IVL"
[13:54] sustrik "however, if set to zero, it's ignored and RECOVERY_IVL is used instead"
[13:54] sustrik "default is zero"
[13:54] drbobbeaty sustrik: OK, sounds reasonable with a nice fallback.
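
The precedence rule sustrik proposes can be written down as a small function. This is only a sketch of the rule as stated in the chat, not the code that went into 0MQ; the function and parameter names are hypothetical.

```python
def effective_recovery_ms(recovery_ivl_sec, recovery_ivl_msec=0):
    """Resolve the two recovery options into one interval in milliseconds.

    RECOVERY_IVL_MSEC takes precedence; when it is zero (the proposed default)
    it is ignored and RECOVERY_IVL, given in seconds, is used instead.
    """
    if recovery_ivl_msec > 0:
        return recovery_ivl_msec
    return recovery_ivl_sec * 1000
```

Existing applications that only set the seconds-based option keep their behaviour, which is the backward-compatibility point of the discussion.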
[14:10] sustrik Steve-o: what out_buffer_size should I use?
[14:11] sustrik and what parameters you were running local_the/remote_the with?
[14:11] sustrik thr*
[14:12] Steve-o I tried 1420
[14:12] Steve-o and "tcp..." 100 1
[14:12] sustrik great
[14:12] sustrik i'll check that
[14:50] drbobbeaty sustrik: I assume all these changes for the recovery time should be to the default, 'development' release and not the 'maint' release, correct?
[14:51] mikko drbobbeaty: yes, just for the master branch
[15:21] toni hey there. I am using a REQ socket that connects to a set of servers (XREP). Is there a way to notice when a server dies? My intention is to wait for max. 2 seconds, until I consider the server as gone. But the s.recv() (s is a REQ socket) blocks. Is there a way to achieve this, or should I make use of a XREQ which is non-blocking?
[15:21] mikko toni__: you should be able to use zmq_poll for this
[15:21] mikko you can also pass ZMQ_NOBLOCK flag to zmq_recv
[15:22] toni cool, thanks. great channel, for all my questions I get a answer very fast. This avoids searching the huge guide twice for such particular part of information. Thanks!
[15:23] guido_g toni__: there is also
[15:26] toni guido_g: i know, but I expected to maybe get a correcting answer in case my solution for the problem would already be provided by zmq itself.
[15:27] guido_g toni__: it was more a response to " This avoids searching the huge guide twice..."
[15:31] sustrik hm, this is almost a FAQ
[15:32] sustrik i've answered this particular question like twice in past two days
[15:44] toni hey there, one more question. I could not find a way to disconnect a socket from an address it was once connected to. I can only find socket.close() but thats not what I need. So is there a way to disconnect from an address?
[15:49] sustrik close the socket
[15:50] toni the socket is connected to a set of addresses. In case one does not answer, I want to remove the connection to this address, so I close the socket and reconnect it to all addresses without the one that did not work?
[15:51] sustrik i presume that when the non answering address becomes available again, you want to reconnect to it?
[15:52] toni yes
[15:52] sustrik 0mq does that for you
[15:52] sustrik you don't have to care about non-answering endpoints
[15:53] sustrik they are ignored and reconnected when they become available again
[15:53] toni okay, that's great. Does this also mean that messages won't be sent to an address that seems currently not available?
[15:54] bobdole369 Hello everyone, tnx for letting me idle in here all weekend :x - didn't really mean to do that... OK on to it: Have a possible project in the coming weeks and am looking at 0MQ as the data transport mechanism.
[15:54] bobdole369 Few queries about a few things come to mind.
[15:55] sustrik toni__: yes, but set HWM to something low
[15:55] sustrik so that requests are not queued too much for a destination that may become unavailable later on
[15:57] toni sustrik: but it can be that a message wont be sent out, and is queued until the address becomes available again? Thats what I have to avoid.
[15:58] sustrik that's done by resending the request once the timeout expires
[15:58] sustrik you should also discard duplicate replies, of course
[15:58] bobdole369 Have a number of embedded devices in the "field" that will transmit data to a central datacenter server. The data are infrequent small data points, perhaps 2kb of data is actually a lot. We control these and can author the packets. Is 0MQ a suitable transport method for this data?
[15:59] sustrik (the whole resend functionality should be actually implemented inside 0mq, but it's not yet)
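
The resend-and-discard-duplicates approach sustrik describes has to live in the application for now. A transport-agnostic sketch follows. Everything here is an assumption made for illustration: the class name, the send/recv callables, and the idea that each reply echoes a request id so stale or duplicate replies can be recognised. The chat does not say how duplicates should be detected.

```python
class RetryingClient:
    """Resend a request after each timeout and drop replies that do not match it.

    send(request_id, payload) and recv(timeout_ms) are hypothetical callables
    supplied by the application; recv returns (request_id, reply) or None on
    timeout.
    """

    def __init__(self, send, recv, timeout_ms=2000, max_attempts=3):
        self._send = send
        self._recv = recv
        self._timeout_ms = timeout_ms
        self._max_attempts = max_attempts
        self._next_id = 0

    def request(self, payload):
        self._next_id += 1
        request_id = self._next_id
        for _attempt in range(self._max_attempts):
            self._send(request_id, payload)
            while True:
                answer = self._recv(self._timeout_ms)
                if answer is None:
                    break  # timed out: go round again and resend
                reply_id, reply = answer
                if reply_id == request_id:
                    return reply
                # Reply to an earlier request (stale or duplicate): discard it.
        raise TimeoutError("no reply to request %d" % request_id)
```

Note that each discarded reply restarts the wait, so the overall deadline is approximate; a stricter version would track the remaining time per attempt.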
[15:59] sustrik bobdole369: you want to deploy 0mq on the devices?
[16:00] toni sustrik: Thanks for your help
[16:00] bobdole369 That is possible. They are PLC style devices though, and not PC's. They do speak ansi C.
[16:01] sustrik toni__: you are welcome
[16:01] sustrik bobdole369: what about the OS?
[16:02] bobdole369 On the PLC devices
[16:06] stephank bobdole369: No need to apologize for idling around, that's pretty common practice on IRC. Zeromq is C++ and builds on top of BSD sockets and threading APIs, amongst others. Those are typically implemented by an OS. Are those provided by your embedded platform?
[16:24] drbobbeaty sustrik: I'm looking at the code and can't find any reference to tpdu_size, but I do find references to get_max_tsdu_size(). Should I be using the tsdu_size from this call, or am I missing something obvious? I'm trying to convert either timespan into a count of sequence numbers - as per Steven's suggestion.
[16:29] bobdole369 OIC ya sockets and the API does seem to be done by the embedded platform - M258 Schneider PLC
[16:35] sustrik drbobbeaty: ask steven about what value to actually use
[16:35] sustrik i am not an expert on PGM
[16:35] drbobbeaty sustrik: Got it... will do.
[16:35] sustrik alternatively you may find the definitions in RFC3208
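
The conversion drbobbeaty is attempting, from a recovery timespan to a count of sequence numbers, might be sketched as below. This is an assumption-laden illustration: it takes the data rate in kilobits per second and treats every packet as one full-size unit of packet_size bytes. Whether that size should be the TPDU or the TSDU is exactly the open question in the chat, so the parameter is left generic and the function name is hypothetical.

```python
def recovery_window_sqns(rate_kbps, recovery_ms, packet_size):
    """Estimate how many sequence numbers cover recovery_ms of traffic.

    rate_kbps * recovery_ms gives the number of bits sent in the window;
    dividing by 8 * packet_size converts that to packets, rounded up.
    """
    bits_in_window = rate_kbps * recovery_ms
    bits_per_packet = 8 * packet_size
    return (bits_in_window + bits_per_packet - 1) // bits_per_packet
```

For example, 100 kbit/s over a 1000 ms window with 1500-byte packets comes to 9 sequence numbers under these assumptions.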
[17:34] bobdole369 I'm a fair bit noob, so can I ask what is the advantage that 0MQ holds over OS calls and sockets?
[17:34] mikko bobdole369: there are several
[17:35] mikko i think personally the biggest advantage is work in terms of messages rather than bytes
[17:35] mikko normally when writing a non-blocking server you have the problem of getting EAGAIN back and then reading a bit more bytes and maintaining state of where the protocol boundaries go
[17:36] mikko another benefit is being able to switch almost transparently between different transports
[17:36] mikko and of course the built-in messaging patterns
[17:36] mikko publish-subscribe, request-reply etc
[17:36] bobdole369 Yes the patterns are mostly what brought me here.
[17:37] mikko there are several other benefits as well
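
To make mikko's point about bytes versus messages concrete, here is roughly the kind of bookkeeping an application does on a raw stream socket. It is a sketch with an assumed 4-byte length prefix (this is not 0MQ's wire format) and a hypothetical FrameBuffer class: bytes arrive in arbitrary chunks and the code has to remember where message boundaries fall.

```python
import struct


class FrameBuffer:
    """Reassemble length-prefixed messages from arbitrary byte chunks.

    The 4-byte big-endian length prefix is an assumption for this example.
    """

    HEADER = struct.Struct("!I")

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add received bytes and return every message that is now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buf)
            end = self.HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet, keep the partial state
            messages.append(bytes(self._buf[self.HEADER.size:end]))
            del self._buf[:end]
        return messages
```

With a message-oriented library a receive call hands back whole messages, so this state machine is not the application's problem.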
[19:10] jhawk28 sustrik: are you here?
[19:24] delaney are there any up to date C# examples?
[19:29] mikko delaney: the examples are usually pretty portable
[19:29] mikko you should get the hang of C# by looking at here
[20:08] sustrik jhawk28: hi
[21:36] CIA-20 zeromq2: Martin Sustrik master * rec61751 / (src/pub.cpp src/sub.cpp src/xpub.cpp src/xsub.cpp):
[21:36] CIA-20 zeromq2: options.type correctly set for PUB/SUB/XPUB/XSUB
[21:36] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[21:58] CIA-20 zeromq2: Martin Sustrik master * r8d6cafe / (10 files):
[21:58] CIA-20 zeromq2: All devices conflated into a single implementation.
[21:58] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[22:11] CIA-20 zeromq2: Martin Sustrik master * r73bbcb5 / builds/msvc/libzmq/libzmq.vcproj :
[22:11] CIA-20 zeromq2: MSVC build fixed
[22:11] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -