Thursday December 9, 2010

[Time] Name Message
[00:00] oxff how can i get all FDs from a zeromq socket?
[00:00] oxff say, i want to embed zeromq into my existing libev app
[00:01] oxff seems that the only way to use zeromq asynchronously is to develop around zmq_poll
[00:04] cremes oxff: read up on ZMQ_FD and ZMQ_EVENTS
[00:04] cremes that is part of the 2.1 release (in beta) which makes it easy to use with other libraries
[00:04] oxff ah thanks
[00:04] cremes and zmq_poll is built on top of those primitives
[00:05] oxff wait it only gives me one fd per zmq_socket?
[00:05] oxff how does it work with a bind socket?
[00:06] oxff or multiple connected sockets?
[00:07] oxff does it create a fake fd?
[00:13] oxff looking at tcp_listener.cpp, it only returns the bound fd in get_fd
[00:13] oxff too tired to read all of the source
[00:19] oxff wait, does zeromq internally use threading / create threads for accepting new connections etc?
[00:22] oxff how can i issue a non-blocking connect with zmq_connect?
[00:28] oxff lol you
[00:28] oxff you're threading and the fd is just some pipe fd you accumulate to?
[00:28] oxff this is bullshit
[01:50] testuser hello
[08:46] rrg hi
[08:49] rrg i am unsure how to implement multiplexed sending using a PUB socket. to be specific: a changing number of producers write multipart messages of varying length, at varying times and from varying threads. do i need to synchronize the writing somehow and abandon the multipart messaging altogether?
[08:54] Steve-o give each thread a 0mq socket then ensure each thread sends multi-part atomically
[08:57] Steve-o you could always add extra sockets, i.e. one per session per thread too
[08:58] mikko Steve-o: im getting openpgm compile failure on x86 solaris 10
[08:58] Steve-o only tried opensolaris on intel
[08:59] Steve-o have you updated the compiler flags?
[08:59] Steve-o >>
[09:00] mikko Steve-o: i have
[09:00] mikko sec
[09:01] mikko
[09:01] mikko i was working on getting --with-pgm build to work on solaris10 using sun studio last night
[09:01] Steve-o ok, x86/Solaris is pretty much a dead platform
[09:01] mikko had to remove the strict flag because __PRETTY_FUNCTION__ is not declared in strict-compliance mode
[09:02] Steve-o that's gcc only ?
[09:02] mikko -Xc mode in solaris studio drops standards incompliant things
[09:02] mikko __PRETTY_FUNCTION__ is "pseudo-standard"
[09:02] Steve-o it should be picking up __func__ instead
[09:03] Steve-o just checked, __PRETTY_FUNCTION__ should only be used with __GNUC__ defined
[09:04] Steve-o I don't even bother with __func__, I think that's only SunPro
[09:04] Steve-o but your error is assembler
[09:04] mikko __func__ is supported by all of them
[09:04] mikko i guess it would be fairly easy to ifdef OPENPGM_FUNCTION or similar
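The ifdef mikko suggests could be sketched like this; `OPENPGM_FUNCTION` is his proposed name, not actual OpenPGM code. `__PRETTY_FUNCTION__` is a GCC extension (in plain C it expands to the same string as C99's `__func__`), so gating on `__GNUC__` keeps strict-mode compilers such as Sun Studio with `-Xc` happy:

```c
/* Sketch of the suggested wrapper: prefer the GCC-only __PRETTY_FUNCTION__
   when available, fall back to the standard C99 __func__. */
#include <stdio.h>

#if defined(__GNUC__)
#       define OPENPGM_FUNCTION __PRETTY_FUNCTION__
#else
#       define OPENPGM_FUNCTION __func__
#endif

const char *whoami (void)
{
    /* expands to the enclosing function's name, e.g. "whoami" */
    return OPENPGM_FUNCTION;
}
```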
[09:05] mikko
[09:05] Steve-o ok i see the problem
[09:06] Steve-o I assume sun = sparc
[09:06] Steve-o which is usually pretty good considering the other options are awful
[09:06] Steve-o sunpro on linux doesn't define sun
[09:07] Steve-o so what does "sun" mean?
[09:07] Steve-o it's not there for opensolaris
[09:08] mikko sun and __sun are defined for x86 solaris
[09:08] mikko sparc and __sparc for SPARC
[09:08] mikko as far as i know
[09:09] mikko i dont have access to sparc
[09:09] Steve-o sparc is for ultrasparc and older
[09:09] mikko
[09:09] mikko this says that it's for solaris operating system
[09:10] Steve-o it's __sparcv9 for the box I have
[09:10] Steve-o which probably makes __sparc a v8
[09:12] Steve-o ok, have to guess which ones match gcc
[09:12] Steve-o I don't even use them which makes it worse
[09:13] Steve-o they're slower on the hardware I have
[09:13] mikko interesting
[09:13] mikko CC man page says that 'sun' is sparc only macro
[09:14] mikko but that might be from time before solaris studio on x86
[09:14] mikko not sure
[09:14] Steve-o x86 solaris is really old, but yes there is a lot of confusion
[09:14] Steve-o looks like prefetch -> sun_prefetch_read_many
[09:14] Steve-o prefetchw -> sun_prefetch_write_many
[09:15] Steve-o I'll have to switch on my sparc and check
[09:15] rrg Steve-o: the cost of adding a socket is insignificant?
[09:15] Steve-o I think I have a few fixes to back port to 5.0 too anyway
[09:16] mikko Steve-o: i use +w -Xc to compile with sun studio
[09:16] mikko if you want to test
[09:16] Steve-o rrg: 0mq automagically manages the multiplexing, so once one underlying transport is set up it is cheap
[09:18] rrg Steve-o: by setup you mean the context initialization?
[09:18] Steve-o rrg: the IO thread(s) open up the TCP/IPC/PGM sockets in the background
[09:19] rrg ah, ok. so context init = transport init
[09:20] rrg Steve-o: but won't binding the socket work exactly once? in case of tcp/ip unicast?
[09:21] Steve-o i'm going on what the whitepapers on the architecture say :-)
[09:24] rrg Steve-o: ? -> /me ?
[09:27] Steve-o brb, a couple of things to work on
[09:29] guido_g rrg: the context has nothing to do w/ a particular transport
[09:34] guido_g icp?
[09:35] rrg ipc
[09:35] rrg guido_g: seems to work!
[09:37] rrg guido_g: ok, i have single thread now which binds a pub socket to tcp and sits in a recv loop on an ipc pull socket.
[09:38] rrg guido_g: plus a certain amount of threads which send multipart messages to thread-specific ipc push sockets which they close when they terminate.
[09:38] rrg guido_g: seems to work fine with 2 context i/o threads. any antipattern i'm using here?
[09:39] guido_g 2 contexts? seems fishy
[09:39] Steve-o mikko: aaagh, that strict mode is annoying
[09:41] Steve-o I'm not even getting "sun" defined
[09:41] toni hey there. I have a XREQ client connected to a set of XREP servers. When one server dies, the client still tries to send messages to it. Is this the default zmq behavior, or am I doing something wrong?
[09:42] Steve-o mikko: so strict mode presumably means only __sun is defined? always the underscore requirement?
[09:43] rrg guido_g: 2 i/o threads, 1 ctx
[09:44] guido_g rrg: i dont think you need 2 io threads, it's a waste of resources
[09:46] rrg guido_g: is there any use case for n>1 threads?
[09:47] guido_g rrg: no, not that i know of
[09:48] rrg guido_g: ok
[09:49] rrg guido_g: ty
[09:52] toni I described my issue in detail here:
[09:59] Steve-o toni_: send onto the list too
[09:59] guido_g toni_: but first write a working test, please
[09:59] toni Steve-o: I did this. The working test is also linked
[09:59] guido_g toni_: ahh... and remove the useless recursion
[10:00] guido_g toni_: there're three code snippets that are obviously part of something else
[10:00] toni guido_g: the snippets I linked ?
[10:01] guido_g toni_: what else?
[10:01] toni guido_g: no thats exactly my code and my problem
[10:02] guido_g toni_: why didn't you post the code as it is then? what you posted appears to be 3 unrelated snippets
[10:02] toni guido_g: My approach is that when a server does not answer for two seconds, I resend the message
[10:03] toni guido_g: But the XREPServer and the XREQConnect are both used in the Testcase
[10:04] Steve-o mikko: I have trunk working with sun one in strict mode now, will have to try ICC later
[10:06] toni guido_g: in the Testcase I first start one server, then the second, then the third. Whenever one server is not available, my XREQ send method runs into the timeout as it does not get any reply for 2 seconds
[10:07] toni guido_g: and in this case I resend the message
[10:07] guido_g toni_: i know, i can read
[10:09] toni guido_g: What I would need, is to tell the XREQ at XREQConnect to disconnect from the addresses that are obviously broken
[10:10] guido_g toni_: ømq should do that, if not, show a comprehensible working test case and file an issue
[10:13] toni guido_g: Okay, I'll open up an issue. Seems to me as if it's not working properly
[10:14] guido_g toni_: make your test actually working and as small as possible, otherwise it'll be difficult to prove your point
[10:15] toni guido_g: yes I will. Thanks for your hints anyway.
[10:49] toni One question remains. In zmq I can connect a socket to some addresses (that never were bound before), send messages to it and they will be buffered until the addresses become available. So consider one client and two servers. The client connects to both. The messages are load-balanced between the servers. When one server dies (s.close()) the client will try to send its message to that address. As it is not available they are buffered. A
[11:45] sustrik drbobbeaty: hi
[11:45] sustrik would you do the -1 default, or should i?
[11:47] drbobbeaty sustrik: I'll be glad to. I just need to get a few things done this morning and then hammer it out.
[11:47] sustrik great
[11:47] drbobbeaty I want to build RPMs, etc. to verify etc. So it'll take me just a little bit.
[11:47] sustrik sure, no haste
[11:52] toni I posted my problem here with a minimal testcase:
[11:57] guido_g toni_: does it work if you start both servers and then kill one of them?
[11:59] guido_g toni_: of course, after the client connected to them
[12:02] toni guido_g: no, when I start both servers, then start the client e.g. sending 100 000 msgs and then kill one server, the client still tries to send to the dead one
[12:03] guido_g ouch
[12:04] guido_g sounds really like a bug
[12:04] toni The testcase is quite simple, I hope everyone can reproduce this behavior
[12:10] guido_g yep, killing one of the servers causes a delay every other request
[12:13] toni guido_g: but actually, zmq should behave differently and notice the dead server?
[12:13] guido_g should be in the docs
[12:15] guido_g ahhh.. got it
[12:16] guido_g set the HWM for the sending socket to 1, then it works
[12:19] toni guido_g: You are my hero !!!
[12:19] toni guido_g: hanging on this issue for almost 2 days now
[12:19] toni guido_g: Thanks a lot for your help!!!
[12:19] guido_g the reason is -- i think -- the unbounded queue for this endpoint
[12:20] toni guido_g: what do you mean by the unbounded queue?
[12:20] toni guido_g: As far as I understood the HWM limits the send-buffer of the socket to the size of 1
[12:20] guido_g every endpoint under a ømq socket has a queue for messages to be sent to it
[12:21] toni ah okay, thats what I just called buffer...
[12:21] guido_g not of the ømq socket, but the underlying endpoint
[12:23] toni guido_g: thanks for your help. Great project and great community
[13:38] bobdole369 Hello good morning, is zeromq used often with embedded devices? Specifically Rabbit Semiconductor stuff, or Schneider (or others) PLCs?
[15:17] mikko bobdole369: it might make sense to ask that question on the mailing lists to reach a bigger audience
[15:21] Steve-o mikko: what strict parameters are you using for ICC? I'll check tomorrow and update trunk for PGM
[15:31] mikko Steve-o: sec
[15:31] mikko -strict-ansi
[15:43] Steve-o taa, will test with 5.1 and backport to 5.0
[15:43] mikko cool
[15:44] Steve-o only OSX needs 5.1 because of the lack of spin locks
[15:44] Steve-o maybe something with Win x64 too but I can't recall
[16:25] mikko sustrik: there?
[17:04] sustrik mikko: hi
[17:05] mikko sustrik: im having issues with ZMQ_QUEUE
[17:05] sustrik yes?
[17:05] mikko it seems that my messages are not being closed
[17:05] mikko it's a bit tricky to reproduce
[17:05] mikko but i got front=pull, back=push device
[17:06] mikko and if i am not consuming the message the memory usage goes up steadily
[17:06] mikko if i then start consumer the memory usage doesn't go down
[17:06] mikko even though messages are flowing
[17:06] mikko i can see using valgrind that memory is held in the messages
[17:06] sustrik are you publishing at full spead?
[17:06] sustrik speed
[17:07] mikko yes
[17:07] mikko is it expected that memory usage never goes down?
[17:08] sustrik well, if receiver is slower than publisher it can happen
[17:08] mikko but even if i stop publisher and start consuming
[17:09] mikko it seems that usage does not go down
[17:12] sustrik that looks like a leak
[17:12] sustrik can you report the problem along with the test program?
[17:13] mikko does this make sense:
[17:13] mikko i generate 10k messages, then turn consumer on/off/on/off
[17:13] mikko until no more messages flow through
[17:13] mikko kinda simulating flaky network
[17:13] mikko and at the end it seems that not all messages get freed
[17:13] mikko but there is no more coming from pipe
[17:14] sustrik how do you know they weren't freed?
[17:15] mikko valgrind shows them in "still reachable" memory
[17:18] mikko it might be something in my app as well
[17:18] mikko i will investigate further and open a ticket if it's somewhat reproducible
[17:36] sustrik mikko: how did you shut down the device?
[17:37] mikko sustrik: ctrl + c
[17:37] sustrik let me check the code...
[17:37] sustrik master?
[17:37] mikko yes
[17:37] mikko it could very well be my test code
[17:38] mikko i need a statistics collection over http
[17:38] mikko so im writing a small webserver that distributes the messages to backend nodes for processing
[17:43] sustrik mikko: aha
[17:43] sustrik the devices are not shutting down decently
[17:43] sustrik if (errno == ETERM)
[17:43] sustrik return -1;
[17:43] sustrik what they should do, i guess
[17:44] sustrik is to set LINGER to 0
[17:44] sustrik and terminate via zmq_term()
[18:51] delaney are there any examples of the proper use of socket.RecvAll() for C#?
[18:54] sustrik delaney: i don't think there are many examples for clrzmq2, try asking on the mailing list
[18:54] delaney k
[20:26] sustrik mikko: hm, i was wrong
[20:27] sustrik Ctrl+C actually kills the program, thus not freeing the allocated memory
[20:27] sustrik so valgrind would naturally report leaks
[20:27] sustrik alternative would be to install SIGINT handler and try to clean-up before exiting
[20:45] CIA-20 zeromq2: 03Bob Beaty 07master * rfcfad56 10/ (6 files in 3 dirs): (log message trimmed)
[20:45] CIA-20 zeromq2: Added Recovery Interval in Milliseconds
[20:45] CIA-20 zeromq2: For very high-speed message systems, the memory used for recovery can get to
[20:45] CIA-20 zeromq2: be very large. The current limitation on that reduction is the ZMQ_RECOVERY_IVL
[20:45] CIA-20 zeromq2: of 1 sec. I added in an additional option ZMQ_RECOVERY_IVL_MSEC, which is the
[20:45] CIA-20 zeromq2: Recovery Interval in milliseconds. If used, this will override the previous
[20:45] CIA-20 zeromq2: one, and allow you to set a sub-second recovery interval. If not set, the
[20:45] CIA-20 zeromq2: 03Martin Sustrik 07master * ra9d969a 10/ AUTHORS :
[20:45] CIA-20 zeromq2: Bob Beaty added to the AUTHORS file
[20:45] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[21:41] mikko sustrik_: the thing i was seeing is that the memory just 'sits' there
[21:41] mikko and doesn't go down during when the program runs
[21:46] sustrik that's the OS problem AFAIK
[21:46] sustrik OS doesn't return deallocated memory back to the shared pool
[21:46] sustrik rather the memory is kept by the running process
[21:47] sustrik Solaris claims that it's able to reuse the memory, but my tests haven't confirmed it
[21:49] mikko i shouldn't see that memory in still reachable in valgrind though
[21:49] sustrik right, that's interesting
[21:49] mikko but it might've been my application, as i don't seem to be able to reproduce it after a couple of hours of refactoring
[21:50] sustrik yes, this kind of problem is pretty hard to track down
[21:52] sustrik if you are able to reproduce in the future we can give it a try though
[21:54] mikko cool
[21:54] mikko the tool feels pretty stable now
[21:54] mikko can push a bit over 3k http requests per second on my virtual machine which seems more than enough
[22:02] sustrik :)