Thursday April 8, 2010

[Time] Name Message
[05:03] sustrik mikko: Assertion failed: !sending_reply || !more || reply_pipe != pipe_ (rep.cpp:96)
[05:03] sustrik are you playing with multi-part messages?
[06:34] CIA-5 zeromq2: 03Martin Sustrik 07master * r38e9103 10/ src/object.cpp : issue 13 (Assertion failed: load.get () == 0 (epoll.cpp:49)) fixed -
[07:20] mikko sustrik: nope
[07:22] mikko normal request reply
[08:15] mikko sustrik: although i still can't get gdb to work
[08:19] sustrik mikko: i've fixed the load.get () == 0 problem
[08:19] sustrik now looking at !sending_reply || !more || reply_pipe != pipe_ (rep.cpp:96)
[08:19] mikko sustrik: nice!
[08:23] mikko i wonder what's going on now
[08:24] mikko
[08:24] mikko it gets stuck there
[08:26] mikko that is request-reply pattern: create server, create client, client sends, server recvs, server sends, client recvs
[08:27] sustrik can you do a quick check with local_lat/remote_lat?
[08:27] sustrik it's the same pattern
[08:28] sustrik and the test used to work before the multi-part message changes
[08:28] mikko downgrading to 2.0.6 works
[08:28] sustrik ok
[08:28] mikko ok, give me a sec
[08:29] mikko local_lat ?
[08:29] sustrik perf/local_lat
[08:29] sustrik latency test
[08:30] sustrik local_lat tcp://lo:5555 1 1000
[08:30] mikko ah
[08:30] sustrik remote_lat tcp://localhost:5555 1 1000
[08:30] mikko Assertion failed: false (xrep.cpp:37)
[08:30] mikko Aborted
[08:30] mikko tested with XREP/XREQ
[08:30] sustrik these are not functional at the moment
[08:30] sustrik they are not even documented
[08:31] mikko ok
[08:31] mikko message size: 1 [B]
[08:31] mikko roundtrip count: 1000
[08:31] mikko average latency: 17.049 [us]
[08:33] sustrik hm, that works
[08:33] sustrik can i get the test program that caused the problem?
[08:33] mikko let me make a reproducible case
[08:33] sustrik thanks
[08:34] mikko sustrik: it's one of the php tests
[08:34] sustrik :|
[08:34] mikko i should be able to make a C test case
[08:45] mikko no, can't reproduce outside php extension
[08:46] mikko REQ send, REP recv
[08:46] mikko it gets stuck on recv
[08:49] mikko similar thing works if i run client and server in different processes
[08:49] sustrik it's inproc transport?
[08:49] mikko tcp://
[08:50] sustrik within a single process?
[08:50] mikko yes
[08:50] sustrik hm, given that gdb doesn't work for you, there's no chance of getting a backtrace, right?
[08:51] mikko i can test on more stable environment later
[08:52] sustrik ok
[08:53] sustrik now for the !sending_reply || !more || reply_pipe != pipe_ (rep.cpp:96) problem
[08:53] sustrik it looks like there must be multipart message involved somehow
[08:53] sustrik strange
[08:53] sustrik is it php or c test?
[08:53] mikko php
[08:54] sustrik 2 endpoints?
[08:54] mikko in two separate processes
[08:54] sustrik tcp?
[08:54] mikko yes
[08:54] mikko REQ/REP
[08:54] sustrik ok, do you know how to work with wireshark?
[08:54] sustrik or tcpdump
[08:54] mikko i can tcpdump
[08:55] mikko wireshark requires a gui iirc?
[08:55] sustrik there's command line version called tshark
[08:55] sustrik tcpdump is ok
[08:56] sustrik i just want to know what data are passed between the two
[08:56] sustrik can you get that for me?
[08:58] mikko ok, i'll try to get a dump
[09:00] mikko can't seem to lure it out after the latest epoll fix
[09:02] mikko i wonder if it was related to that (something wrong in my end)
[09:08] CIA-5 zeromq2: 03Martin Sustrik 07master * r77cbd18 10/ src/pgm_receiver.cpp : issue 11 - Assertion failed: it != peers.end () (pgm_receiver.cpp:161) -
[09:20] mikko sustrik: now i can reproduce
[09:20] mikko time for tcpdump
[09:22] olivier_c hello everybody
[09:24] sustrik olivier_c: hi
[09:26] mikko sustrik:
[09:26] mikko that is tcpdump -vvv -w comm.pcap -i lo port 5555
[09:38] olivier_c sustrik, i've a little question about the behavior of sub multiple sockets.
[09:38] sustrik sure
[09:38] olivier_c *multiple sub
[09:39] sustrik mikko: (inspecting the pcap)
[09:39] olivier_c if i create 2 sub sockets (in different threads), and send messages on them, the reception "alternates", i.e. socket_one receives a message, then zmq waits on socket_two and ignores messages arriving on socket_one.
[09:39] olivier_c multiple sub sockets can't receive messages simultaneously ?
[09:40] sustrik they should
[09:41] sustrik what socket type are you using on sender side?
[09:41] olivier_c ZMQ_PUB
[09:42] sustrik " if i create 2 sub sockets (in different threads), and send messages on them"
[09:42] sustrik you mean "to them"
[09:42] sustrik right?
[09:42] olivier_c yes,
[09:43] mikko sustrik: possibly related symptom: my poll server example stopped working as well, it doesn't send a message even though zmq_send returns 0 and the socket is continuously marked as writable
[09:43] mikko which causes eternal loop in polling
[09:43] sustrik olivier_c: do you have a test program?
[09:44] sustrik mikko: ok, as for the tcp dump
[09:44] olivier_c yes, i send you this in few minutes
[09:44] mikko hopefully gdb works on my other box
[09:44] sustrik there seem to be 3 connections established
[09:44] sustrik right?
[09:44] sustrik olivier_c: please, do
[09:46] sustrik then on one of them the "hello there!" message is sent
[09:50] sustrik mikko: a double check - the pcap you've sent caused the !sending_reply || !more || reply_pipe != pipe_ (rep.cpp:96) assertion, right?
[09:52] mikko sustrik: yes
[09:52] sustrik ok
[10:03] sustrik mikko: hm, there seems to be no way this can happen :(
[10:03] sustrik it's php?
[10:03] mikko yep
[10:03] mikko seems like most of the things are not working with github master
[10:04] mikko the poll-server.php example gets stuck in eternal loop where socket is constantly writable
[10:04] mikko even though message was just sent
[10:04] mikko but the client does not receive the message
[10:04] mikko downgrading to 2.0.6 and everything works
[10:05] sustrik it possibly has to do with multipart messages
[10:05] sustrik i have to get it reproduced somehow...
[10:05] mikko you could try with the php extension?
[10:05] sustrik if you instruct me how to do it
[10:06] mikko
[10:06] mikko tar xfj php-5.3.2.tar.bz2
[10:06] mikko cd php-5.3.2
[10:06] mikko ./configure --disable-all --prefix=/opt/local --enable-debug
[10:07] mikko make && make install
[10:07] mikko to get php running with debug
[10:07] mikko to install zeromq extension
[10:07] mikko git clone git://
[10:07] mikko cd php-zeromq
[10:08] mikko /opt/local/bin/phpize && ./configure --with-php-config=/opt/local/bin/php-config
[10:08] mikko make install
[10:10] sustrik mikko: done
[10:10] sustrik what now?
[10:12] mikko ls -l /opt/local/lib/php/extensions/debug-non-zts-20090626/
[10:12] mikko do you have there?
[10:12] sustrik yes, i do
[10:12] mikko create new file /opt/local/lib/php.ini and add:
[10:12] mikko extension_dir=/opt/local/lib/php/extensions/debug-non-zts-20090626/
[10:12] mikko
[10:12] mikko after that /opt/local/bin/php -m should show zeromq in the extension list
[10:13] mikko then in the php-zeromq dir run /opt/local/bin/php examples/poll-server.php
[10:13] mikko and the client /opt/local/bin/php examples/client.php
[10:13] mikko the client should send and recv single message
[10:14] sustrik it did
[10:15] mikko are you using 2.0.6 ?
[10:15] sustrik it's trunk
[10:15] sustrik i think
[10:15] sustrik let me double check
[10:15] mikko for me it works with 2.0.6 but trunk gets into eternal loop on the server side
[10:16] sustrik how do i know?
[10:17] sustrik it says: Received message: hello there!
[10:17] sustrik then does nothing
[10:17] mikko does the client exit after that?
[10:17] sustrik nope
[10:17] sustrik one cpu core 100%
[10:17] mikko yep
[10:17] mikko the server is in a loop
[10:18] mikko exit the server using ctrl c
[10:18] sustrik zmq_poll returns POLLIN
[10:18] sustrik but there's no message
[10:18] mikko if you rerun the same thing and exit client with ctrl c
[10:18] sustrik right?
[10:18] mikko Received message: hello there!
[10:18] mikko Assertion failed: !sending_reply || !more || reply_pipe != pipe_ (rep.cpp:96)
[10:18] mikko Aborted
[10:18] mikko yep
[10:18] mikko if the client goes away that assertion should appear
[10:19] sustrik bingo!
[10:19] sustrik thanks for helping me with reproducing it
[10:21] mikko no prob
[10:24] mikko it might be my code as well, but seems like with 2.0.6 things work as expected
[10:52] sustrik mikko: how is it linked?
[10:52] sustrik if i tweak and reinstall 0mq
[10:52] mikko dynamic linking
[10:52] sustrik will it do
[10:52] sustrik ok
[10:52] mikko ldd /opt/local/lib/php/extensions/debug-non-zts-20090626/
[10:52] mikko you can make clean && make install in case there are ABI changes
[11:15] mikko sustrik: installed debian stable and gdb works there
[11:15] mikko broken in testing
[11:18] sustrik fine
[11:21] olivier_c Ah, thanks for the answer sustrik :)
[11:22] sustrik you are welcome
[11:28] olivier_c btw, another little thing: there is still the same problem with the docs on install (solution: touch doc/zmq.7) and after that "make" doesn't find a rule for "forwarder.1". (i've commented out some lines in the Makefile of the "doc" directory to ignore this)
[11:33] sustrik olivier_c: that's 2.0.6 or trunk?
[11:35] olivier_c from git, "sustrik-zeromq2-0f7aab5.tar.gz"
[11:36] sustrik can you paste the error here?
[11:39] olivier_c so the first one (with "./configure --with-pgm")
[11:39] olivier_c checking whether to install manpages... configure: error: configure thinks we want to install manpages but they're not present. Help!
[11:39] olivier_c => touch doc/zmq.7
[11:39] olivier_c then ok
[11:40] sustrik hm, have you run
[11:40] olivier_c yes
[11:41] sustrik let me try it myself...
[11:43] sustrik olivier_c: ok, got it
[11:43] sustrik i'll open a ticket
[11:44] sustrik btw, if you clone the repo using standard git clone, you won't get the error
[11:44] olivier_c re :/
[11:45] sustrik <sustrik> olivier_c: ok, got it
[11:45] sustrik <sustrik> i'll open a ticket
[11:45] sustrik <sustrik> btw, if you clone the repo using standard git clone, you won't get the error
[11:46] olivier_c ok
[12:01] keenerd Hi.
[12:02] keenerd Another makefile question
[12:03] keenerd gets make install'd, and it seems like it should not be
[12:04] sustrik i am not sure
[12:04] sustrik shouldn't it?
[12:05] keenerd It seems to be a bit of scrap from the build process.
[12:05] sustrik yes, it's a libtool bit
[12:06] sustrik i am not sure how it is supposed to work
[12:06] sustrik maybe it can be included by other libtool-based builds?
[12:06] sustrik keenerd: try ask on the mailing list, please
[12:06] sustrik someone may have an idea about it
[12:07] keenerd I'm still trying to sort it out.
[12:08] keenerd "Optional and frowned upon, but required in some magical cases" seems to be the opinion of .la files.
[12:09] sustrik so i would say let's rather keep it there
[12:09] sustrik in case a magical case happens
[12:09] keenerd I have a user complaining otherwise.
[12:10] sustrik why so?
[12:10] sustrik does it mess with anything?
[12:11] keenerd Not sure. It seems my distro has a policy against .la files, that they can quickly become a cross linked mess.
[12:11] keenerd (Every week [for years now] I learn something new about the underbelly of linux, it seems.)
[12:12] sustrik well, autotools by itself is complex and mysterious
[12:12] keenerd True.
[12:12] sustrik anyway, I have no opinion on the matter
[12:12] sustrik ask on the mailing list
[12:12] sustrik if nobody complains we can try to remove it
[12:13] keenerd Righto. Thanks.
[12:15] keenerd Wow, the zmq site is hilariously broken in elinks :-)
[12:15] sustrik works for me...
[12:16] sustrik (firefox)
[12:17] sustrik here's the direct link:
[12:19] keenerd Yeah, that is the page. Had to fire up X just to look at it.
[12:20] keenerd It ended up crashing Elinks, maybe a stack overflow from some bad JS?
[12:24] sustrik no idea, it's hosted on a wikifarm
[12:24] sustrik so we don't write the html ourselves
[12:32] keenerd ML'ed.
[12:33] keenerd Thanks again.
[13:40] mikko sustrik: did you find out anything more regarding the assertion?
[14:15] sustrik mikko: not yet
[14:15] sustrik been out of office
[14:41] sustrik mikko: you there?
[14:41] sustrik here's the problem
[14:41] mikko im here
[14:41] sustrik zeromq.c:282
[14:41] sustrik
[14:41] sustrik oops
[14:41] sustrik rc = zmq_send(intern->zms->socket, &message, zmq_msg_size(&message));
[14:41] sustrik the third parameter is 'flags' not 'size'
[14:41] sustrik the message contains the size all right
[14:41] sustrik you don't have to specify it second time
[14:42] mikko makes a lot of sense
[14:42] sustrik accidentally, the message size of "Got it!" is 7
[14:42] mikko i guess flags were unused in 2.0.6?
[14:42] sustrik which means bit 0 is set
[14:42] sustrik there's a ZMQ_NOBLOCK flag
[14:43] sustrik you haven't hit the problem by chance
[14:44] sustrik still, it shouldn't fail
[14:45] sustrik i'll fix my part
[14:45] sustrik you fix yours
[14:45] mikko thanks a lot
[14:45] sustrik np
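
(Editor's sketch) The bug found above is that zmq_msg_size(&message) was passed where zmq_send expects its flags argument, so the 7-byte reply "Got it!" was sent with flags = 7. The snippet below only illustrates the bit arithmetic. ZMQ_NOBLOCK occupying bit 0 comes from the conversation; MULTIPART_FLAG and its value 2 are assumptions made for illustration (the chat only hints that a multi-part bit was involved), so check zmq.h for the real constants.

```python
# Flag values used for illustration only.
ZMQ_NOBLOCK = 1      # bit 0, as stated in the chat
MULTIPART_FLAG = 2   # ASSUMPTION: hypothetical "more parts follow" bit


def decode_flags(flags):
    """Return the names of the illustrative flag bits set in `flags`."""
    names = []
    if flags & ZMQ_NOBLOCK:
        names.append("ZMQ_NOBLOCK")
    if flags & MULTIPART_FLAG:
        names.append("MULTIPART_FLAG")
    return names


payload = b"Got it!"
# Passing the message size as the third argument amounts to this:
accidental_flags = len(payload)

if __name__ == "__main__":
    print(accidental_flags, bin(accidental_flags), decode_flags(accidental_flags))
    # The intended call passes no flags at all:
    print(0, bin(0), decode_flags(0))
```

Under these assumed values, a 7-byte payload turns on both bits, which would be consistent with a plain reply being treated as non-blocking and as part of a multi-part message. With a different payload length a different set of bits would be set, which may be why the problem looked intermittent.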
[16:01] sophacles Hey all, slightly offtopic, but I'm instrumenting some systems to learn how they operate under various forms of contention. CPU loading is cake, as is CPU+Network loading. Now im looking for a tool to fill the network pipe without bogging down the cpu. Does anyone here have experience with such a tool?
[16:03] sophacles (all the ones ive come across so far use 100%cpu when filling the pipe)
[16:20] sustrik no idea, iperf?
[16:36] sophacles unfortunately iperf eats all the cpu. thanks for the suggestion tho!
[16:41] sustrik :\
[16:41] sustrik sophacles: what about local_thr/remote_thr?
[16:41] sustrik with a VERY BIG MESSAGE
[16:50] sophacles not sure i follow...
[16:52] sustrik sophacles: 0MQ performance test
[16:52] sustrik it does nothing but send large chunk of memory down TCP
[16:52] sophacles cool... ill give that a try!
[16:53] sustrik ./configure && make
[16:53] sustrik then it's in perf subdirectory
[16:54] sophacles yeah, i see them there... i just never thought to actually *use* them :)
[16:54] sophacles kind of a facepalm moment on my part really
[16:55] sustrik try using message size of several megabytes
[16:55] sustrik that should ensure the least possible CPU usage
[16:56] sophacles ok, thanks for the tip... ill let you know how that works once i get the stuff pushed to all the right places :
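
(Editor's sketch) A rough way to see why very large messages should keep CPU usage low: the number of messages needed per second to saturate a link falls in proportion to message size, so any fixed per-message cost is paid far less often. The 1 Gbit/s link speed is an assumption for illustration, and the arithmetic ignores TCP/IP and framing overhead.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # ASSUMPTION: a 1 Gbit/s link


def messages_per_second(link_bits_per_second, message_bytes):
    """Messages needed each second to fill the link, ignoring protocol overhead."""
    return link_bits_per_second / (message_bytes * 8)


if __name__ == "__main__":
    for size in (1, 1024, 4 * 1024 * 1024):
        rate = messages_per_second(LINK_BITS_PER_SECOND, size)
        print(f"{size:>8} B per message -> {rate:,.1f} msgs/s")
```

Under these assumptions, filling the pipe with 1-byte messages would take 125 million sends per second, while 4 MiB messages need only about 30.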
[17:05] CIA-5 zeromq2: 03Martin Sustrik 07master * r5cd9f74 10/ src/rep.cpp : few fixed related to multi-part messages in REP socket -
[17:21] CIA-5 zeromq2: 03Martin Sustrik 07master * r027bb1d 10/ src/zmq.cpp : issue 10 - zmq_strerror problem on Windows -
[21:08] sophacles sustrik: so, with the latest zeromq2 stuff debian packaging should probably be cleaned up to not include all the language bindings, correct?
[22:37] dirtmcgirt hi all. getting a "Exception in thread "main" java.lang.UnsatisfiedLinkError: /usr/local/lib/libjzmq.0.dylib: no suitable image found." with the latest checkout of jzmq
[22:37] dirtmcgirt trying the test program: "java -Djava.library.path=/usr/local/lib -classpath ../src/Zmq.jar:. local_lat tcp:// 1 100"