Saturday November 6, 2010

[Time] Name Message
[07:02] CIA-20 zeromq2: Martin Sustrik master * rc021702 / src/mailbox.cpp :
[07:02] CIA-20 zeromq2: Coding style cleanup in mailbox.cpp
[07:02] CIA-20 zeromq2: Signed-off-by: Martin Sustrik <> -
[09:07] sustrik takeda: hi
[09:08] sustrik you can add new transports but the interface for a transport is not particularly clean
[09:08] sustrik i would like to clean it up and document it
[09:08] sustrik so that people can add their own transports as needed
[10:05] xraid how do i read from and send to a raw socket from zmq? preferably an ipc://
[10:22] sustrik what raw socket?
[10:22] sustrik BSD socket you mean?
[10:27] xraid yes a unix:path socket
[10:46] sustrik BSD sockets have nothing to do with 0mq
[10:46] sustrik use standard BSD socket API
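Editor's sketch of the answer above: a Unix domain socket used through the standard socket API, with no 0mq involved. Python's `socket` module wraps the same bind/connect/send/recv calls. The function name `unix_echo_roundtrip` and the file name `demo.sock` are made up for illustration, and the example assumes a platform with AF_UNIX support.

```python
import os
import socket
import tempfile
import threading


def unix_echo_roundtrip(payload: bytes) -> bytes:
    """Send payload to a one-shot echo server over a Unix domain socket."""
    # Hypothetical socket path inside a fresh temporary directory.
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.sock")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # creates the socket file on disk
    server.listen(1)

    def serve_once() -> None:
        conn, _ = server.accept()
        with conn:
            data = conn.recv(4096)  # thin wrapper over recv(2)
            conn.sendall(data)      # calls send(2) until every byte is written

    worker = threading.Thread(target=serve_once)
    worker.start()

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    client.sendall(payload)
    reply = client.recv(4096)  # a short payload is assumed to arrive in one read
    client.close()

    worker.join()
    server.close()
    os.unlink(path)      # the socket file is not removed automatically
    os.rmdir(directory)
    return reply
```

Calling `unix_echo_roundtrip(b"hello")` should hand back the same bytes. Mixing such a socket with 0mq sockets in one poll loop is a separate question, as noted later in the log.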
[10:47] dv hi
[10:47] dv can I use the same zeromq context in two threads?
[10:48] xraid sustrik: what's the -- "cremes Guthur: because you can use it to poll a standard socket as well as a 0mq socket"
[10:48] xraid PollItem ?
[10:52] dv hi. according to the docs, I can create one context and use it in more than one thread
[10:52] dv but is there something I should know about trying to create a publisher in one thread and a subscriber in another?
[10:53] dv I am trying to create a chat program using pub-sub and multicasting
[10:53] dv (epgm)
[10:53] dv previously I used two contexts, one in each thread, this didnt work.
[10:54] dv i dont need synchronization - it doesnt make sense, anyway, since every chatter is a publisher -and- a subscriber
[11:36] sustrik dv_: the context issue is irrelevant
[11:37] sustrik try starting with tcp transport
[11:37] sustrik when it works switch to epgm
[11:38] sustrik xraid: you've asked about send and recv -- use send(2) and recv(2) for those
[11:38] sustrik polling is a different issue
[11:38] sustrik as you may want to combine raw sockets and 0mq sockets
[11:40] dv hm
[11:40] dv well it works now
[11:40] dv with tcp, it is set up immediately
[11:40] dv but with epgm, connecting takes about 3 seconds
[11:41] dv multicast rate is set to 10000
[12:39] dv is epgm broken in zeromq?
[12:39] dv when I put a sleep(1) before the connect() call in the receiver, i get an "Invalid argument" error
[13:00] dv uh, it gets weirder. apparently, this is related to gcc. I build zeromq with -O0 and everything works just fine... ?!?!?
[13:12] mikko dv_: are you using master or maint?
[13:16] dv zeromq 2.0.10
[13:16] dv nothing from git
[13:16] dv no something is very wrong. I recompile the app twice, no changes to the code at all.
[13:17] dv first time : "invalid argument", second time it works
[13:20] dv hmm. is this maybe a race condition? when I step through the connect function, it works. when I dont step, it fails.
[13:20] dv (stepping with gdb i mean)
[13:24] mikko which version of gcc?
[13:24] mikko sounds a bit odd indeed
[13:24] mikko have you got a reproducible test case?
[13:25] dv here, I reduced this to this code:
[13:25] dv this code sometimes succeeds
[13:25] mikko testing
[13:25] mikko sec
[13:25] dv and sometimes I get terminate called after throwing an instance of 'zmq::error_t' what(): Invalid argument
[13:25] dv built with g++-4.4 -Wextra -Wall -std=c++0x -pedantic -o epgm_test epgm_test.cpp -lzmq
[13:26] mikko need to build with pgm
[13:26] mikko second
[13:26] dv okay. it also happens with fewer flags btw. : g++-4.4 -o epgm_test epgm_test.cpp -lzmq
[13:27] mikko im testing with git trunk
[13:27] mikko havent used maint in quite a while
[13:27] dv oh, you might want to adjust the URL
[13:27] dv I use eth1
[13:29] mikko
[13:30] mikko this is with trunk
[13:30] dv yes. now try with -O0
[13:30] dv then with -O2
[13:30] mikko no wait
[13:30] mikko
[13:30] mikko look
[13:30] mikko it died after a while
[13:31] dv uh, never had that
[13:31] mikko i think i've seen this before
[13:31] Ghpu hello
[13:31] dv did you set your app to sleep?
[13:31] mikko same thing with -O0
[13:31] dv because, my code exits immediately
[13:31] mikko no, it's blocking
[13:31] dv hmm it does not block here
[13:32] dv see why I suspect that something's wrong about epgm ?
[13:32] mikko different semantics in trunk
[13:32] mikko possibly
[13:32] mikko let's see where it's blocking
[13:33] mikko
[13:34] Ghpu I've just succeeded in having a working version of ZeroMQ 2.1 from trunk ported to the Android platform \o/
[13:34] mikko Ghpu_: nice!
[13:34] Ghpu It required to rebuild the libuuid first
[13:35] dv hmm.
[13:35] Ghpu and comment out some POSIX optimizations
[13:35] dv it seems to me as if multicast connect works *sometimes*
[13:35] dv and sometimes it doesnt
[13:35] Ghpu i made a package temporarily available on if someone is interested
[13:36] dv i'll try out trunk
[13:36] Ghpu it required to use an unofficial ndk, which supports full stdc++ and
[13:36] Ghpu stl
[13:36] Ghpu but I think it's worth it
[13:39] dv oh now this is interesting. trunk uses a much more recent version of openpgm.
[13:39] mikko dv_: yes, as far as i know there is going to be openpgm related testing soon
[13:39] mikko to iron out any bugs
[13:40] mikko Ghpu_: this sounds like an interesting project
[13:40] dv hmm. doesnt seem to happen anymore.
[13:41] Ghpu Well, I got it to talk properly between the emulator and a host program on my desktop computer
[13:41] mikko Ghpu_: is this 'easy' to cross-compile?
[13:41] Ghpu I will put a small page on my website to host it for now
[13:41] dv oh but I got your assertion, mikko
[13:41] dv so trade one bug for another? :)
[13:41] Ghpu well, my package is a copy of the needed file with a compile script
[13:41] mikko dv_: i think it would be good to make a test case out of this
[13:41] dv yes, unfortunately it doesnt seem to be deterministic
[13:41] dv as in "run it 1000 times"
[13:42] Ghpu but the files are slightly modified from the trunk, and the directory structure is different
[13:42] Ghpu so I don't see an easy integration now
[13:42] mikko Ghpu_: does the directory structure need to be different?
[13:42] Ghpu but the overall cross-compilation should be easy
[13:42] Ghpu well, the cross-compiler expects a specific structure
[13:42] mikko Ghpu_: it would be good if this could be automated so i can add a daily build for it
[13:42] Ghpu as far as i know
[13:42] mikko
[13:43] Ghpu I still need more time to make a proper diff/patch to the zmq files
[13:43] Ghpu this is only my first working step
[13:43] mikko Ghpu_: do you see this as something that could be cleanly added to upstream?
[13:43] Ghpu I'm afraid not
[13:44] Ghpu (or not easily for now)
[13:44] mikko ok
[13:44] Ghpu just to let you know that it can work
[13:45] mikko that is really good news
[13:45] dv yes, seems very much like a race condition. perhaps I can look into it a bit more
[13:45] mikko dv_: that would be great
[13:46] dv gotta go now, however. thanks for the help guys, i'll keep you posted.
[15:04] Guthur Oh, does Swap need to be set after binding a socket?
[15:07] Guthur actually is that the case for all options?
[15:08] Guthur If I set HWM and SWAP before binding for the durable PubSub it doesn't seem to work as expected, is that par for the course?
[19:38] shales is there a recommended way of setting the file permissions and group of the unix domain socket when binding an IPC socket?
[19:39] shales should I set the umask and setguid before calling bind? does the underlying call to bind happen before 0mq's bind returns or at some later time?
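Editor's sketch for the umask part of the question, using a plain Unix domain socket only. It does not show when 0mq performs its underlying bind, which the log leaves unanswered. The helper name `bind_with_umask` and the file name `perm.sock` are hypothetical; the sketch assumes a POSIX system where the umask applies to the socket file created by bind.

```python
import os
import socket
import stat
import tempfile


def bind_with_umask(mask: int) -> int:
    """Bind a Unix domain socket under the given umask; return the file's permission bits."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "perm.sock")  # hypothetical file name

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    previous = os.umask(mask)  # umask is process-wide, so restore it right after bind
    try:
        sock.bind(path)        # the socket file is created during this call
    finally:
        os.umask(previous)

    mode = stat.S_IMODE(os.stat(path).st_mode)

    sock.close()
    os.unlink(path)
    os.rmdir(directory)
    return mode
```

With a mask of `0o077` the returned mode should carry no group or other bits. For a plain socket, the group could then be changed with `os.chown` once the file exists; whether the same ordering is safe around 0mq's bind is an open question here.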
[19:43] shales come to think of it, how does 0mq work when an IPC socket connects to a unix domain socket whose file is left over from a previous run and the server endpoint hasn't called bind yet? I think the server side would have to unlink the file before it can bind again, and 0mq seems to do this automatically, but how do the sockets that connect know to reconnect to the new unix domain socket?
[20:10] shales ah, I guess after the server closes, the connect will just fail even if the unix domain socket file still exists
[20:10] shales and 0mq will do its reconnecting
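Editor's sketch of the stale-file behaviour described above, again with plain Unix domain sockets rather than 0mq. It checks three things: the file outlives the closed listener, a connect to it is refused, and a new bind only works after unlinking. The function name `stale_socket_demo` and the dictionary keys are invented for illustration, and the errno values assume Linux-like behaviour.

```python
import errno
import os
import socket
import tempfile


def stale_socket_demo() -> dict:
    """Show what happens to a Unix domain socket file after its listener closes."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "stale.sock")  # hypothetical file name
    results = {}

    # First "server run": bind, listen, then go away without unlinking.
    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    first.bind(path)
    first.listen(1)
    first.close()
    results["file_left_behind"] = os.path.exists(path)

    # A client connecting to the leftover file has nobody to talk to.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.connect(path)
        results["connect_errno"] = 0
    except OSError as exc:
        results["connect_errno"] = exc.errno
    finally:
        client.close()

    # Second "server run": binding over the leftover file fails.
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        second.bind(path)
        results["rebind_errno"] = 0
    except OSError as exc:
        results["rebind_errno"] = exc.errno
    finally:
        second.close()

    # Unlinking the stale file first lets a fresh socket bind.
    os.unlink(path)
    third = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    third.bind(path)
    results["bind_after_unlink"] = os.path.exists(path)
    third.close()

    os.unlink(path)
    os.rmdir(directory)
    return results
```

The refused connect is the failure a reconnecting client would keep retrying against until a new listener binds the path.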
[21:14] Guthur I am getting an assert fail (line 60 xrep.cpp), while trying to run my version of LRUQueue
[21:15] Guthur I am using INPROC instead of IPC, due to Win platform; this shouldn't make a difference, correct?
[21:31] dv hi
[21:32] dv mikko, I narrowed the bug to code inside openpgm - the parse_interface() function
[21:32] dv it tries to find a multicast capable interface out of the list I gave it, and it seems that the one I give is sometimes multicast capable, sometimes it isn't
[21:42] Guthur Oh the assert fail is sorted, it was an issue with my implementation of z_set_id
[21:42] Guthur it wasn't being random enough
[21:51] dv got it - the assert is gone now
[21:51] dv but I am not sure why
[21:51] dv :)
[21:51] mattl I have a valgrind zmq issue.
[21:52] mattl is this the right forum to ask it in?
[21:52] dv yes, but I guess you'll have more luck on the mailing list
[21:53] mattl sure. no prob. I will also ask it to the list. But I think my problem is pretty basic. I can't seem to get my test app to recv anything under valgrind
[21:54] mattl I am using the test_taskworker.c example app and running it under valgrind and it no longer receives any info
[22:06] dv alright, posted the issue to the mailing list
[22:07] dv seems to me the openpgm guys wanted to put a little intelligence in their interface checking code, intelligence that goes haywire.
[22:07] dv mattl_: without valgrind it always works?
[22:13] mattl dv_ I feel quite silly now. Just put the sleep() back into the sender, right after the send, and it works now.
[22:13] mattl thanks for the moral support
[22:13] mattl durf..
[22:15] dv :)
[22:15] dv uh, wait
[22:15] dv sleep() after the sender?
[22:16] dv sounds like you are masking a race condition or something