Thursday June 9, 2011

[Time] Name Message
[02:46] bitcycle Hey all. I'm wondering if someone here has used zeromq with cassandra for persistence?
[02:58] aaronblohowiak the multithreaded programs hang with no output in both c and python on my osx box
[03:01] aaronblohowiak i am using the examples, and have compiled 2.1.7 on my system for linking with the c program
[03:28] aaronblohowiak oh nevermind
[03:28] aaronblohowiak whoops!
[03:28] aaronblohowiak i feel silly
[04:05] michelp i know that feeling
[04:22] aaronblohowiak i spun up dtrace and was like "WHY IS THIS HITTING POLL AND KQUEUE" and then i thought.. oh, maybe it requires a separate client...
[04:22] aaronblohowiak v.v shame!
[04:22] aaronblohowiak i should have read the whole thing instead of copy/pasting the example
[06:17] yukonbob hello, #zeromq
[09:39] sunoano anybody using ? If so, what's your experience? (I am evaluating a zmq-based queue for a Django project so ...)
[09:52] mikko sunoano: no experience
[09:52] sunoano mikko: ok, fair enough :)
[09:52] mikko i am not really a python man
[09:53] mikko so that probably explains why
[11:14] pieterh anyone interested in secure pubsub over 0MQ?
[11:15] pieterh I've started a page to draft a specification for this:
[12:24] Arvicco Hi guys, anyone with good knowledge of ffi-rzmq awake here? ;)
[12:26] mikko Arvicco: if you are in luck cremes might be here
[12:27] Arvicco Actually, I'm not sure it's specific to ffi-rzmq... I'm just using only this binding to ZMQlib
[12:30] Arvicco The thing is, I have a pair of Ruby daemons using pubsub pattern, one binding to ZMQ:PUB socket, another connecting to ZMQ:SUB. When my Publisher goes down, it closes its socket and properly terminates the context.
[12:30] Arvicco However, when Publisher is restarting and trying to rebind, it's getting "Address in use (ZMQ::SocketError)" exception...
[12:31] mikko can you check what state the socket is in ?
[12:31] mikko try with netstat after the pub terminates
[12:36] Arvicco It's TIME_WAIT
[12:37] mikko it doesn't sound like clean termination
[12:38] pieterh Arvicco: libzmq does an SO_REUSEADDR when binding but afair this can still fail if there are clients connected to the socket that don't catch the error and close their sockets
[12:39] pieterh can you check if the error happens when you don't start any subscribers?
[12:39] pieterh you will need to switch to another port, or wait 5 (or 30) minutes for the TIME_WAIT to expire
[12:39] Arvicco yes, my subscriber is still connected to the socket when Publisher leaves. may this be the problem?
[12:40] pieterh normally 0MQ should catch the error and close the TCP socket but it might not do that properly
[12:41] pieterh in C, it seems to work, but could be system dependent
[12:41] pieterh what OS are you on?
[12:43] Arvicco I'm on Windows XP SP3
[12:43] pieterh Right. Can you try killing the subscriber, see if the publisher can then re-bind?
[12:43] Arvicco Hmm, interesting - now I'm not starting Subscriber at all, and still get the same error upon Publisher restart
[12:44] pieterh hang on, publisher and subscriber are both on Windows?
[12:44] pieterh or one Linux, one Windows?
[12:45] Arvicco No, both are on Windows
[12:45] Arvicco right now
[12:45] Arvicco but in future one will be moved to FreeBSD
[12:45] pieterh ok, once the socket is in TIME_WAIT, it's too late, you have to wait for the TCP timeout to expire
[12:46] Arvicco OK, I'll retry in 5 min
[12:46] pieterh can you retest on a different port, start / stop publisher without subscribers, restart, see if that works
[12:46] pieterh then start publisher, start subscriber, stop subscriber, stop publisher, restart publisher
[12:46] Arvicco A, OK - different port
[12:51] Arvicco OK, starting Publisher on a fresh port, no Subscriber. Shutting down, netstat does not show the port at all. Restarting it, same error (?!).
[12:52] pieterh you're sure you restarted on fresh port, not old one?
[12:56] Arvicco OK, seems like I used this port too about 20 minutes ago
[12:57] Arvicco On a new new port, Publisher restarts without any problem
[12:58] pieterh ok, so it's caused by subscribers not properly closing their sockets
[12:58] pieterh I'd log this as a 0MQ on Windows error, it's unlikely to be in ffi-rzmq
[12:59] pieterh I've seen this before but only with clients running on Windows
[12:59] pieterh if the clients are on Linux (or any Unix), this doesn't happen
[12:59] pieterh however it's weird that the server can't rebind, usually that does work on Windows even with misbehaving clients
[13:00] Arvicco Now I've tried "start publisher, start subscriber, stop subscriber, stop publisher, restart publisher" routine and this works too
[13:01] Arvicco But my use case requires Publisher restarts without Subscriber shutdown. Subscriber is a logger service, actually
[13:05] Arvicco Actually, there is more - my Publisher is opening other (non-ZMQ) TCP ports to receive data from local service. So, if I STOP the Publisher and THEN kill this service, then Publisher could restart even if the Subscriber was still hanging around...
[13:07] Arvicco Seems like there is some kind of interplay between normal TCP sockets and ZMQ-powered sockets to me
[13:18] Arvicco OK, moving Subscriber to other machine made it work... Seems like all kind of weird stuff happens with TCP sockets on Windows - too bad I cannot move away from it completely.
[13:21] Arvicco Thanks for your help, pieterh, mikko!
[13:57] pieterh Arvicco: np
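
Note on the [12:38] remark about SO_REUSEADDR: the rebind-after-restart sequence discussed above can be sketched with plain TCP sockets from the Python standard library. This is only an illustration under stated assumptions, not libzmq's actual bind code. The helper names `bind_listener` and `restart_cycle` are hypothetical, and the outcome is platform-dependent (SO_REUSEADDR has different semantics on Windows than on Linux or the BSDs), which may be part of what the thread ran into.

```python
import socket


def bind_listener(port, host="127.0.0.1", reuse=True):
    """Open a listening TCP socket, optionally with SO_REUSEADDR set.

    Hypothetical helper for illustration; not part of any 0MQ binding.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if reuse:
        # Ask the OS to allow binding even if old connections on this
        # port are still lingering in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def restart_cycle(host="127.0.0.1"):
    """Bind, accept one client, shut down, then rebind the same port."""
    first = bind_listener(0, host)          # port 0: let the OS pick one
    port = first.getsockname()[1]

    client = socket.create_connection((host, port))
    conn, _ = first.accept()

    # The "publisher" side closes first, so its end of the connection
    # is the one that normally ends up in TIME_WAIT.
    conn.close()
    first.close()
    client.close()

    # "Restart": bind the same port again. Without SO_REUSEADDR this
    # step may raise OSError ("Address already in use") on some systems.
    second = bind_listener(port, host)
    rebound_port = second.getsockname()[1]
    second.close()
    return port, rebound_port
```

Where the rebind succeeds, `restart_cycle()` returns the same port number twice; an `OSError` from the second `bind_listener` call corresponds to the "Address in use" condition reported in the log.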
[21:06] CIA-76 jzmq: kwo master * r56bd495 / (3 files): consolidated embedded library logic into one class -
[21:06] CIA-76 jzmq: Gonzalo Diethelm master * re4a6f1f / (3 files): Merge pull request #51 from kwo/develop ...