Thursday December 2, 2010

[Time] Name Message
[00:19] erickt Hi #zeromq! congrats on 2.1.0. I was wondering, did sys://log make it into 2.1.0?
[00:20] erickt and if so, what is getting logged?
[01:54] Remoun How can I get the remote IP (of the sender) when receiving over TCP?
[04:20] DanielHolth any word on a ctypes zeromq binding?
[04:20] DanielHolth for Python?
[07:32] delaney has anyone had any luck getting pyzmq to build with 3.1?
[07:34] delaney trying with the 2.0.10
[07:53] sustrik delaney: autobuilds:
[07:54] sustrik Remoun: you can't
[07:55] sustrik erickt: it has made it in
[07:55] sustrik you can subscribe to it
[07:55] sustrik but nothing is logged yet :)
[07:59] delaney sustrik that looks cool but doesn't seem to have a link to the zmq package, or is that not available?
[07:59] sustrik nope, it doesn't create packages
[08:00] sustrik but it shows that the pyzmq can be built with master
[08:03] delaney oh i don't doubt it
[08:03] delaney actually just got it working with 32bit visual studio 2008
[08:04] delaney but can't seem to with 64bit visual studio 2010
[08:09] sustrik report the problem on the mailing list then...
[08:09] sustrik i cannot really help myself as i don't have win64
[08:14] Steve-o sustrik, hope you like the bug I found on MSVC :-)
[08:23] sustrik Steve-o: haven't read all emails yet
[08:23] sustrik which one is that?
[08:24] Steve-o ok, end of "encoder hanging in remote_thr tests" thread.
[08:24] Steve-o compiler over-optimisation
[08:25] sustrik uh, you've sent a patch
[08:25] Steve-o it's a workaround, not a real patch
[08:25] sustrik anyway, mark the emails containing patches with [PATCH]
[08:25] Steve-o I have no idea what is a good solution, leave that up to you
[08:25] sustrik otherwise it's pretty easy to forget about it
[08:26] Steve-o I've dumped a Win32 API call in there, it's not useful to commit it
[08:26] sustrik ok, i see
[08:26] Steve-o still working on why linger isn't working
[08:27] Steve-o is it supposed to on pub sockets?
[08:27] sustrik Steve-o: i would say the problem is that OpenPGM doesn't have linger
[08:27] sustrik so the LINGER is set on 0MQ level
[08:27] sustrik when 0MQ pushes all data to OpenPGM
[08:27] sustrik it considers the work done
[08:27] sustrik and exits
[08:28] sustrik which closes the process
[08:28] Steve-o the problem here is 0MQ isn't sending anything to PGM
[08:28] sustrik thus dropping the PGM tx buffers
[08:28] sustrik ah
[08:28] Steve-o it only init's the pgm_socket object
[08:28] Steve-o then destroys it
[08:28] Steve-o core bug
[08:28] sustrik ok, i'll give it a look
[08:28] Steve-o or "feature"
[08:29] Steve-o check your mails and get back to me on the list later
[08:29] sustrik sure, will do
[10:01] PiotrSikora guys, are there any complete examples/guides about integrating zeromq with an existing event loop using ZMQ_FD & ZMQ_EVENTS?
[10:20] sustrik PiotrSikora: have a look at src/zmq.cpp
[10:20] sustrik there's implementation of zmq_poll
[10:21] sustrik which uses ZMQ_FD and ZMQ_EVENTS underneath
[10:21] sustrik combined with either select(2) or poll(2)
[10:25] PiotrSikora sustrik_: yeah, i looked at it... let me clarify...
[10:27] PiotrSikora sustrik_: it is my understanding that in order to hook ZMQ into an existing event loop (kevent, epoll, etc) i need to get the existing fd via getsockopt(ZMQ_FD), then when the system notifies me about an event on that fd, I need to verify that there is a complete ZMQ message available via getsockopt(ZMQ_EVENTS)
[10:27] PiotrSikora however i'm lost at how i can retrieve this message without blocking?
[10:28] PiotrSikora zmq_poll? doesn't look like it
[10:29] PiotrSikora or would zmq_recv() be ok?
[10:29] sustrik zmq_recv()
[10:29] sustrik you know it's there
[10:29] sustrik so you just call zmq_recv() and you'll get it
[10:30] PiotrSikora ok, thx :)
[10:31] PiotrSikora seems like those new features make it extremely easy to integrate now
[10:31] PiotrSikora i remember i looked into this (integrating with existing event loop) a while ago and it seemed rather impossible to do
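The pattern discussed above (wait on a file descriptor in your own event loop, check whether a message is really available, then receive without blocking) can be sketched with the Python standard library alone. This is an analogy under stated assumptions, not the 0MQ API: the `Mailbox` class and its methods are hypothetical stand-ins. In real code the descriptor would come from getsockopt(ZMQ_FD), the readiness check from getsockopt(ZMQ_EVENTS), and the receive from zmq_recv().

```python
import collections
import selectors
import socket


class Mailbox:
    """Hypothetical stand-in for a 0MQ socket: a message queue plus a wake-up fd."""

    def __init__(self):
        self._reader, self._writer = socket.socketpair()
        self._reader.setblocking(False)
        self._queue = collections.deque()

    def fileno(self):
        # Plays the role of getsockopt(ZMQ_FD): the fd the event loop watches.
        return self._reader.fileno()

    def readable(self):
        # Plays the role of checking getsockopt(ZMQ_EVENTS) for a whole message.
        return bool(self._queue)

    def send(self, message):
        self._queue.append(message)
        self._writer.send(b"x")  # wake up whoever is polling the fd

    def recv_nowait(self):
        # Plays the role of a receive that is known not to block.
        return self._queue.popleft()

    def clear_signal(self):
        # Consume pending wake-up bytes so the fd stops reporting readable.
        try:
            self._reader.recv(4096)
        except BlockingIOError:
            pass


def drain(mailbox):
    """Called when the fd fires: keep receiving until nothing is left."""
    mailbox.clear_signal()
    messages = []
    while mailbox.readable():
        messages.append(mailbox.recv_nowait())
    return messages


def run_once(mailbox, timeout=1.0):
    """One iteration of an existing event loop that also watches the mailbox."""
    selector = selectors.DefaultSelector()
    selector.register(mailbox, selectors.EVENT_READ)
    try:
        received = []
        for _key, _mask in selector.select(timeout):
            received.extend(drain(mailbox))
        return received
    finally:
        selector.unregister(mailbox)
        selector.close()
```

The drain loop is the design point: one wake-up on the fd may stand for several queued messages, so the handler keeps checking readiness until it reports nothing left.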
[11:09] mikko this is odd
[11:09] mikko out of the box gcc 3.4.6 does not support -fvisibility=hidden flag
[11:09] mikko but redhat "fixed" version does support it but messes up visibility
[11:55] sustrik Steve-o: hi
[11:55] Steve-o hi
[11:55] sustrik aren't there 2 different problems there?
[11:55] sustrik one of them is solved by sleep()
[11:56] Steve-o at least 2 possibly more
[11:56] sustrik yep
[11:56] sustrik so let's solve these separately
[11:56] Steve-o the memory barrier / fence is the most odd
[11:57] sustrik mikko: mato says you are right
[11:57] sustrik and it's RH bug
[11:57] sustrik Steve-o: yes
[11:57] sustrik anyway, one problem that's pretty obvious is that 0MQ doesn't wait for OpenPGM when terminating
[11:58] sustrik can we solve that one somehow?
[11:58] Steve-o isn't something broken in the linger implementation?
[11:58] sustrik maybe, but that's not the point
[11:58] sustrik the point is that 0mq doesn't know when openpgm has sent all the data
[11:59] Steve-o ok three problems
[11:59] sustrik ok
[11:59] Steve-o #1 0mq does not flush complete batch of messages, instead it only sends first message
[11:59] Steve-o #2 a MSVC 2010 compiler optimisation bug causes encoder.get_data to hang
[12:00] Steve-o #3 short PUB runs followed by zmq_term do not plug the underlying transport
[12:00] mikko sustrik_: i think i got a feasible patch
[12:01] mikko ill talk with mato when he is back
[12:01] sustrik sure
[12:02] Steve-o note #1 & #3 may also be MSVC bugs, no idea
[12:03] sustrik Steve-o: ok, can it be reproduced on a single box?
[12:03] Steve-o #3 is really easy, just run remote_thr with size=100 and count=1 ... 2000
[12:03] Steve-o i added printf statements to every call in pgm_sender.cpp to see what, if anything, is called
[12:04] Steve-o #2 randomly occurs when data is actually sent
[12:04] Steve-o #1 not so important as the others
[12:06] Steve-o of course it will be more annoying if it is hardware dependent
[12:06] sustrik so let's start with #3
[12:07] sustrik pgm?
[12:07] sustrik epgm?
[12:07] sustrik loopback?
[12:07] Steve-o epgm,
[12:07] sustrik does it happen on linux as well?
[12:07] Steve-o haven't checked yet
[12:08] sustrik let me try
[12:08] Steve-o i only saw #1 on linux so far
[12:13] Steve-o linux looks fine here for count=1
[12:14] Steve-o LD_LIBRARY_PATH=src/.libs/ ./perf/.libs/remote_thr --rate-limit 100 "epgm://eth0;" 100 1
[12:14] Steve-o sends 1 packet as expected
[12:22] sustrik so it only happens on windows, right?
[12:23] sustrik i don't have a win box here
[12:26] Steve-o appears so, just tested on another box and reproduced it
[12:30] Steve-o that's why I'm wondering if it is another MSVC compiler optimisation bug
[12:31] Steve-o you really need some heavy unit testing to catch annoying features like this
[12:31] sustrik Steve-o: yes, but win32 is a platform that i am not really using
[12:35] Steve-o unfortunately most VMs aren't too helpful either for multicast testing
[12:37] Steve-o I think it's still only limited to ESX virtual NICs
[12:37] sustrik no idea
[12:38] sustrik anyway, the obvious problem is that the whole win/pgm thing is not going to move forward at any reasonable pace if there's no infrastructure to test it on
[12:43] Steve-o yup, and I guess some assistance in testing the .net libraries would help developers too
[12:46] Steve-o if MSVC is causing this there should be negative consequences on the TCP transport too
[12:50] Steve-o I'm currently using 2008R2 trial on a server and Windows 7 on desktop, but on separate networks :-)
[14:00] mikko mato: found it
[14:00] mikko - backport C++ visibility patches, -fvisibility*, #pragma GCC visibility
[14:00] mikko from RHEL gcc changelog
[14:10] Guthur i there any caveats with having multiple contexts in the one application?
[14:10] Guthur i/is
[14:11] mato mikko: Thought so ... usual silly attitude from RHAT
[14:11] mato mikko: Anyway, I'd suggest going with the approach in your patch for now (don't enable it at all on GCC < 4)
[14:11] mato mikko: I've not reviewed it yet in detail, busy today...
[14:14] mikko mato: check this
[14:14] mikko
[14:14] mikko this is the latest
[14:14] mikko it uses AC_COMPILE_IFELSE to test the visibility
[14:14] mikko the AC_COMPILE_IFELSE should fail if the compiler is not one of the defined
[14:15] mikko additionally it uses 'nm' to check that the symbol actually has the expected visibility
[14:16] mato mikko: 'nm' is not a good idea due to non-portability of its output
[14:16] mato mikko: e.g. solaris nm by default produces completely different output
[14:16] mato mikko: if you really want to go the whole hog and actually test if -fvisibility works, then you'd need to compile a shared object and try and link against it
[14:17] mato mikko: not sure if that's worth the work...
[14:17] mato mikko: since compiling a shared lib would need to be done thru libtool to be portable, etc etc.
[14:18] mato mikko: why not just stick with the bruteforce approach for now and don't even try -fvisibility on GCC < 4?
[14:18] mikko mato: nm format is defined in posix
[14:18] mikko not sure if everyone follows that though
[14:18] mato mikko: Yes, but not everyone follows that
[14:18] mikko ok
[14:19] mikko ill remove the nm part
[14:19] mikko the AC_TRY_COMPILE should still fail with gcc 4<
[14:19] mikko gcc <4
[14:20] mikko let me update the patch
[14:20] mikko mato: you mean something like this:
[14:22] mato mikko: guess so
[14:23] mikko i can also just remove the AC_TRY_COMPILE and use the compiler check
[14:23] mikko if test "x$ac_ .. etc
[14:23] mato make it as simple as possible, this is just for people with broken redhat GCC
[14:24] mikko ok
[14:24] mikko i'll juggle it a bit later
[14:27] sustrik Guthur: you can, but what is it good for?
[14:28] Guthur To make my object disposal strategy in a C# application a little bit more naive
[14:29] Guthur I have a background thread listening for requests on a ZMQ_REQ, it'd like it to be able to take care of its own disposal
[14:29] Guthur it'd/I'd
[14:30] sustrik and?
[14:30] Guthur Well if i share a context with the main thread I was thinking that I couldn't get rid of it without worrying about that socket
[14:31] sustrik the socket returns ETERM when context is terminated
[14:31] Guthur Does that make sense
[14:31] Guthur oh ok
[14:31] Guthur ignore the make sense statement
[14:33] Guthur my mistake, I should have checked, I thought the open socket would block the context from disposing
[14:34] sustrik it will
[14:34] sustrik you'll get ETERM, then you close the socket
[14:34] sustrik then the zmq_term() finishes
[14:39] Guthur ok, thanks for the clarification sustrik
[17:06] drbobbeaty I'm running with the new ZMQ 2.1.0 from the new downloads site. It's running just fine, and I really appreciate all the work that's gone into it. But there is one thing, and I'm not sure what approach to take. When using epgm:// (OpenPGM) as the transport, the call to send() leaks. Not as much as 2.0.10, but it still leaks. For my application, it's still a problem. Is there anything in the "known issues" list for ZMQ or OpenPGM that might clear this up?
[17:06] drbobbeaty am I on my own with the code?
[17:08] mikko drbobbeaty: do you know where it leaks?
[17:08] mikko im not sure if this is a known issue (first time i hear about it)
[17:10] drbobbeaty I only know that if I comment out the call to send(), the leak goes away. Put it in and it leaks (for me) on the order of a couple of MB every few seconds.
[17:11] drbobbeaty I know it's based on the size of the messages, but I don't have a lot of other information on it. I was going to just get down-n-dirty with the code to try and track this down, but I wanted to ask here to see if this is something already known before I spend a few days on this.
[17:12] mikko sustrik might be able to answer this better
[17:13] mikko drbobbeaty: is it simple to reproduce?
[17:13] drbobbeaty mikko: That's the first thing I'm going to do - make a simple test case and then go from there.
[17:14] erickt does it not leak with the other protocols?
[17:14] drbobbeaty erickt: not sure, going to try that too... just at the very early stages (10 min) of this process.
[17:21] cremes drbobbeaty: it might just be queueing the data in memory; what kind of socket are you using?
[17:24] cremes nevermind... that only makes sense for tcp transport
[17:24] drbobbeaty Every idea is welcome. I'm going to do a lot of digging now and then when I have something concrete I'll send it to the mailing list.
[17:29] mikko it does sound like the data is staying in some buffer
[17:30] mikko drbobbeaty: are you closing the messages properly?
[17:30] drbobbeaty yeah, that was my guess, because I've checked on the message itself, and that's OK -- I make a new one for each send, as I thought I read here that's the "best practices" for sending.
[17:30] mikko you got message init and close for each send?
[17:36] drbobbeaty I'm using the C++ API, and that does those in the zmq::message_t class, yes.
[17:37] drbobbeaty mikko: I'm assuming you're asking about the message initialization and close out. The socket stays open for a "long" time.
[17:38] mikko drbobbeaty: yeah, message init and close. looks like the C++ api closes the message upon destruction
[17:38] sustrik drbobbeaty: are you sure you are not pushing data to 0MQ faster than PGM transfer rate?
[17:40] sustrik default transmit rate is 100kb/s
[17:40] drbobbeaty sustrik: I'm not sure what that rate is. I know I'm pushing about 1000 to 10,000 msgs/sec out on different epgm:// connected sockets (different sockets get different parts of the data set)... and I monitor the 10Gb Ethernet and it's nowhere near the limit of the NIC - not even 50%. So I don't think I'm sending it too fast.
[17:41] sustrik then look at ZMQ_RATE socket option
[17:41] drbobbeaty sustrik: I set my default to 200kb/s in the construction
[17:41] drbobbeaty ...of each socket.
[17:41] sustrik ok, and how much do you publish?
[17:41] sustrik if you publish more than 200kb/s then the messages are queued
[17:42] drbobbeaty I'll have to put in better measurement statistics to the logging... right now I look at messages per second, not bytes per socket per second. When I do that, I'll know.
[17:43] sustrik in any case, setting transmit rate to 200kb/s on 10GbE seems overly restrictive
[17:44] drbobbeaty Yeah, I just upped it to 1Mbps and will try that
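sustrik's point about queueing can be checked with rough arithmetic. The sketch below assumes ZMQ_RATE is given in kilobits per second (1 kb taken as 1000 bits) and uses a hypothetical fixed message size; the function name and all numbers are illustrative, not measurements from this conversation.

```python
def backlog_growth_bytes_per_sec(msgs_per_sec, msg_size_bytes, rate_kbps):
    """Estimate how fast unsent data piles up behind a transmit rate limit.

    Returns bytes per second accumulating in the send queue,
    or 0.0 when the offered load fits within the limit.
    """
    offered = msgs_per_sec * msg_size_bytes  # bytes/s the application publishes
    allowed = rate_kbps * 1000 / 8           # bytes/s the rate limit lets through
    return max(0.0, offered - allowed)


# Hypothetical 100-byte messages at the message rates mentioned above.
for rate_kbps in (200, 1000):
    for msgs_per_sec in (1000, 10000):
        growth = backlog_growth_bytes_per_sec(msgs_per_sec, 100, rate_kbps)
        print(f"{rate_kbps} kb/s, {msgs_per_sec} msgs/s: backlog grows {growth:.0f} B/s")
```

Under these assumptions, 1,000 msgs/sec against a 200 kb/s limit queues about 75,000 bytes every second, which would look like a steady leak from the outside. At 1000 kb/s the same load fits, but 10,000 msgs/sec would still queue about 875,000 bytes per second.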
[19:35] ngerakines hey folks, I've got a few questions about application design with zmq.
[19:36] ngerakines In my system, i've got several load threads that subscribe to external pubsub streams and they take the messages they receive and funnel them to another thread that acts as a sort of funnel
[19:37] ngerakines that funnel binds a ZMQ_PULL socket for that purpose
[19:37] ngerakines what I want to do now, is create a number of worker threads that request work from that socket using ZMQ_REP
[19:38] ngerakines so is it possible to have that funnel thread support both the PUSH connections from the loaders as well as REP/REQ connections from workers?
[19:46] Remoun ngerakines; IIUC, your use case fits into the "Request-Reply Broker" pattern, for which there's the built-in Queue device
[19:48] ngerakines I was reading that and got the impression that messages were pushed to the workers (service b) where in my model, I want the workers to request work
[19:53] Remoun AFAIK, the only way for the broker/funnel to know whether workers are available is that workers request work
[19:55] ngerakines so with that, is there a relatively easy way to create a socket that receives both PUSH and REP/REQ requests and is able to determine if an incoming message is one or the other?
[19:55] ngerakines I haven't used poll much, but I'm thinking that I might have to go that route.
[19:55] sustrik yes
[19:55] sustrik two sockets
[19:55] sustrik poll on them
[19:56] sustrik read messages from both as they become available
[19:57] ngerakines ok, thanks everyone
[22:07] raydeo using 0mq 2.0.10 I have an inproc:// ZMQ_PAIR socket that is being used as communication between 2 threads. I'm getting an error when using zmq_connect to the socket in one thread before the other thread has done the zmq_bind... is this a known problem?
[22:07] raydeo the errno received from zmq_connect is ECONNREFUSED
[22:14] mikko raydeo: it's a known limitation
[22:14] mikko you need to bind before connecting
[22:15] raydeo mikko: that's fine, what would you suggest if I don't have control over the order those threads run? a different socket type, or a mutex?
[22:15] mikko raydeo: it's a limitation of inproc transport
[22:16] raydeo ok, so I'll just need to ensure externally the initialization order... shame :(
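One way to enforce the initialization order externally is a one-shot flag that the connecting thread waits on. The sketch below uses only the Python standard library; the bind and connect steps are placeholders recorded in a list (marked in comments), and `run_pair` and its parameter are hypothetical names, so this shows the ordering technique rather than real 0MQ calls.

```python
import threading
import time


def run_pair(binder_delay=0.05):
    """Start a connector and a binder thread; return the order of their steps."""
    bound = threading.Event()  # set once the bind step has completed
    steps = []

    def binder():
        time.sleep(binder_delay)  # simulate the binding thread being scheduled late
        steps.append("bind")      # placeholder for zmq_bind() on the inproc endpoint
        bound.set()               # tell the other thread it is now safe to connect

    def connector():
        bound.wait()              # block until the bind has happened
        steps.append("connect")   # placeholder for zmq_connect() on the same endpoint

    # Start the connector first on purpose to show that the order still holds.
    threads = [threading.Thread(target=connector), threading.Thread(target=binder)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return steps
```

An Event is used rather than a bare mutex because it expresses "this has happened" as a one-shot condition, and the waiting thread needs no knowledge of when the other one runs.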
[23:41] abrown28 anyone listening want to answer a dumb question for me?