Sunday May 1, 2011

[Time] Name Message
[06:58] ASY hm... everything has been singing along just fine... suddenly under some load I started asserting in fq.cpp #94 (2.1.3) where array_t::swap() causes vector subscript out of range. I am exchanging data between two inproc PUSH/PULL sockets. The error occurs in zmq::fq_t::activated(reader_t *pipe_) where active is 1 and pipes[] has size of 1...
[06:59] ASY wondering what can be the cause of this...
[07:01] ASY I am sending messages from different threads as follows:
[07:01] ASY zmq::message_t request(e,e->size(),&event_queue::free_callback);
[07:01] ASY socket_req_.send(request,ZMQ_NOBLOCK);
[07:02] ASY in the main thread I do: id = zmq_poll (&poll_items[0], (int)poll_items.size(), 1000 * 1000);
[07:03] ASY wondering if this is thread related.
[07:03] ASY if anyone has any thoughts, please let me know. thanks.
[07:16] ASY hm... upgraded to 2.1.6, can't seem to reproduce yet...
[07:56] pieterh ASY: what version were you using before you upgraded?
[07:59] ASY 2.1.3
[08:59] pieterh ASY: are you using the same sockets from multiple threads?
[09:00] pieterh if you poll in one thread and send/recv in another, it'll crash under load
[09:00] pieterh (if you're working with the same sockets in multiple threads, that is)
[09:11] ASY hm... you know what... I think this is it!
[09:12] ASY i use two sockets, however the sending socket is not protected
[09:13] ASY yup. that gotta be it. gotta place a mutex on the sending function
[09:17] ASY i was perfectly aware of this problem. it is in the docs, but somehow i didn't see it in the code until you said it.
[09:17] ASY thanks.
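The fix ASY describes, a mutex around the shared sending function, might look roughly like the sketch below. `locked_sender` and its `Socket` template parameter are hypothetical names, not part of 0MQ; the socket type only needs a `send` member taking a string, so a real `zmq::socket_t` would need a thin adapter around `zmq::message_t`. This only serializes access to one socket and is meant for the case where the calling threads are outside your control.

```cpp
#include <mutex>
#include <string>

// Hypothetical wrapper (not part of 0MQ): serializes send() calls so that
// callbacks arriving on unknown threads never use the socket concurrently.
// Socket is assumed to be any type with `bool send(const std::string&)`.
template <typename Socket>
class locked_sender {
public:
    explicit locked_sender(Socket& socket) : socket_(socket) {}

    // Only one thread at a time gets past the lock, so the underlying
    // socket sees strictly sequential use.
    bool send(const std::string& payload) {
        std::lock_guard<std::mutex> guard(mutex_);
        return socket_.send(payload);
    }

private:
    Socket& socket_;    // the shared socket; must outlive this wrapper
    std::mutex mutex_;  // guards every call on socket_
};
```

Every callback would then call `send` on one shared `locked_sender` instead of touching the socket directly, while the receiving socket stays owned by the polling thread.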
[10:06] pieterh ASY: if you're using the same socket from two threads, your design is wrong, usually
[10:11] ASY correct, but not in this case I don't think. i have a central function that gets invoked from different async callbacks of different 3rd party libraries. i don't even know which threads this function gets invoked from. this is special circumstances. if these are my threads, however, they will establish a dedicated socket.
[10:19] pieterh what language are you working in?
[10:19] ASY C++
[10:20] pieterh well, I'm no expert in weird async callbacks but IMO if you end up using mutexes to share 0MQ sockets, your design is probably not ideal
[10:21] pieterh I'd try to get all callbacks to a single thread and then use 0MQ inproc sockets between that and other threads
[10:21] pieterh sorry to just offer generalities... anyhow, glad we found the cause of the crashes
[10:23] ASY the reason I have a central (now mutexed) function is that it relays events to a central event queue using one inproc socket. this is my way of getting all callbacks to a single thread... :)
[10:25] guido_g pieterh: this kind of _design_ is taught all over the place, will be hard to negate the included brainwashing
[10:26] pieterh ASY: have you studied the asyncsrv example in the Guide?
[10:26] pieterh guido_g: seems so...
[10:26] guido_g but hey, that's the price to pay to be in an elitist community ]:->
[10:26] pieterh guido_g: we'
[10:26] pieterh we're elitist?
[10:27] pieterh I thought we were cryptoanarchists
[10:27] guido_g sure, we're everything we want to be
[10:27] guido_g oh, shit, this would be punk... ,)
[10:28] pieterh :)
[10:29] ASY async callbacks come from multiple unrelated existing subsystems, so there is really nothing I can do other than making my own custom event queue, which is pointless since I have zmq. the advantage of using zmq with this is that I do also have multiple processes and multiple threads that push data through individually established sockets to the same queue. so really it is only a segment of the application that suffers from this due to 3rd party callbacks.
[10:30] pieterh ASY: I'm sure you're right, it's hard to tell from where I'm sitting
[10:35] ASY i have another question if you don't mind. I have an application that does broadcast UDP discovery of its siblings on the network... every system broadcasts on a known port and learns about the presence of the others. I am wondering if there is anything in zmq that can be used to achieve something like that.
[10:36] guido_g pub/sub using a common channel for discovery
[10:36] guido_g you might want to use (e)pgm for that
[10:37] ASY ok, i will read up on that and try to figure it out.
[10:38] guido_g basically you use a well known multicast group instead of the subnet's broadcast address
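As a rough illustration of guido_g's suggestion: a 0MQ (e)pgm endpoint names a network interface, a multicast group and a port, in the form `epgm://interface;group:port`. The helper below only builds that string; `epgm_endpoint` is a hypothetical name, and the interface, group and port in the usage note are placeholder values, not anything from this discussion.

```cpp
#include <string>

// Hypothetical helper: builds an endpoint string of the form
// "epgm://<interface>;<multicast group>:<port>".
std::string epgm_endpoint(const std::string& interface_name,
                          const std::string& group,
                          unsigned short port) {
    return "epgm://" + interface_name + ";" + group + ":" +
           std::to_string(port);
}
```

For example, `epgm_endpoint("eth0", "239.192.1.1", 5555)` yields `epgm://eth0;239.192.1.1:5555`. Each peer would presumably use such an endpoint for both a PUB socket (to announce itself) and a SUB socket (to hear the others), assuming the library was built with PGM support.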
[10:40] ASY ok, i will start working on that in about a week probably, will probably be back with questions :)
[10:40] guido_g np
[10:41] ASY 6:40am :/ gotta crash
[10:41] guido_g :)
[10:41] ASY where are you guys located?
[10:41] guido_g europe
[10:41] guido_g so it's time for lunch
[10:42] ASY :)
[10:42] ASY ok, thanks for everyone's help. bbl. cheers.
[10:42] guido_g good night!
[11:02] Guthur did someone finally get 0MQ on android
[11:03] pieterh Guthur: well, I've written up the protocol so it's possible to make it in Javascript now, I guess
[11:03] pieterh I'm doing some Android development but not in C
[11:03] pieterh Also, I'd probably not use 0MQ across an open internet in any case
[11:04] Guthur I just got an android phone yesterday and some development for it is on my todo list
[11:04] pieterh tbh I'd make a http bridge that maps req/rep and pub/sub to simple HTTP methods
[11:04] Guthur ah yes, internet...
[11:05] pieterh get URI -> long poll to receive next sub message
[11:05] pieterh post URI -> publish
[11:05] pieterh etc.
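The mapping pieterh outlines can be written down as a small dispatch table. This is only a sketch of the idea: `bridge_action` and `map_method` are hypothetical names, and anything beyond GET and POST is left unmapped because the log does not specify it.

```cpp
#include <string>

// Hypothetical actions an HTTP-to-0MQ bridge could take for a URI.
enum class bridge_action {
    receive_next,  // GET: long-poll until the next sub message arrives
    publish,       // POST: publish the request body
    unsupported    // anything the log does not describe
};

// Maps an HTTP method name to the bridge action sketched in the chat.
bridge_action map_method(const std::string& method) {
    if (method == "GET") return bridge_action::receive_next;
    if (method == "POST") return bridge_action::publish;
    return bridge_action::unsupported;
}
```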
[11:05] Guthur have you looked at websockets
[11:05] pieterh yes yes but it's not stable
[11:06] pieterh there are several versions and it's still in evolution
[11:06] pieterh it would be the right solution if/when it emerges from its IETF party
[11:06] Guthur yeah, I actually mistakenly assumed it had become stable by now
[11:08] pieterh subscribe to the IETF HyBi list for great merriment
[11:11] Guthur pieterh, do you know if android allows raw TCP connections, so that 0MQ could work over the internet if a secure connection wasn't a requirement
[11:12] pieterh Guthur: afaik yes, it does
[11:12] pieterh in fact it's 100% certain that it does
[11:13] djc longpolling is kind of an ugly hack
[11:13] djc but mapping req/rep to HTTP does make sense
[11:13] djc do the pyzmq guys hang out here?
[11:13] pieterh longpolling is a hack but not any different than waiting on a TCP socket, really
[11:14] pieterh especially if you return a stream rather than just one message
[11:34] guido_g djc: the wrapper devs are here often, if you need them better use the mailinglist
[11:34] guido_g *are not
[11:34] djc I filed an issue
[11:35] guido_g nothing serious
[11:38] guido_g here it works
[11:38] guido_g did you try a specific version?
[11:39] guido_g or just the head of master?
[11:39] djc 2.1.4
[11:39] guido_g ah ok
[11:40] guido_g 2.1dev works
[11:41] guido_g aka latest master
[11:44] djc did you reproduce my issue with 2.1.4?
[11:44] guido_g not yet
[11:47] guido_g nope, works fine here
[11:47] guido_g -> Ran 73 tests in 8.164s
[11:48] guido_g so it might be something blocking the network address and/or port
[11:48] guido_g did you check that?