Tuesday July 13, 2010

[Time] Name Message
[05:57] CIA-17 zeromq2: Martin Sustrik master * rda49e5a / (src/forwarder.cpp src/queue.cpp src/streamer.cpp): devices exit in case of context termination
[09:35] jugg sustrik: Did you intend to revert commit 240fc3 with commit 27877d?
[09:40] sustrik which commit is that?
[09:40] sustrik the latest commit seems to be da49e5a4dd4602bf8931
[09:42] jugg 240fc33f65c6cd9f1ed0a511daf4ad00ff37f163 "minor comment clarification" was reverted by 27877d73ea7dd972a773c7e960706130daaf5925 "EHOSTUNREACH is acceptable outcome from connect"
[09:44] sustrik a-ha
[09:44] sustrik you are right
[09:44] sustrik let me fix it
[09:46] CIA-17 zeromq2: Martin Sustrik master * r2699043 / src/tcp_connecter.cpp : minor comment clarification
[09:46] sustrik done
[09:47] sustrik thanks for spotting it
[09:48] jugg yah. btw. I haven't forgotten the multi-part messaging topic, I just haven't gotten around to posting to the list yet. I have a draft saved, and will do so *soon*.
[09:49] jugg at the moment I'm trying to update the erlang bindings which I think depend on your "migrating sockets between threads" commit.
[09:50] sustrik yeah, it's still in sustrik/zeromq2, but you should be able to test with it at least
[09:50] jugg yep
[09:52] jugg how do you plan on reconciling your master branch and zeromq master now that they have begun to diverge? Will you be rebasing from time to time? Or perhaps move to using temporary topic branches for experimental work?
[09:54] sustrik it's not completely clear yet
[09:54] sustrik but as for the "socket migration" work
[09:55] sustrik the plan is to finish it at sustrik/zeromq2, then merge it to zeromq/zeromq2 as a single patchset
[09:55] sustrik afterwards, there should be testing phase
[09:55] sustrik then a new commit
[09:55] sustrik 2.0.8 presumably
[09:56] sustrik s/commit/release
[09:57] jugg ok, thanks.
[10:55] jugg sustrik: it would appear that the proposed "zmq_process" api is missing from the socket migration?
[10:56] sustrik what zmq_process?
[10:56] sustrik there's no API change
[10:56] sustrik ah, got it
[10:57] sustrik you mean polling on 0MQ sockets using POSIX poll, right?
[10:57] sustrik yes, it's missing, but actually, i've thought of better API in the meantime
[10:57] sustrik the whole thing can be accomplished using 2 socket options
[10:58] sustrik getsockopt (ZMQ_FD) would return an fd you can poll on
[10:58] sustrik once it signals POLLIN, it means something happened with the socket
[10:58] sustrik so you can do getsockopt (ZMQ_EVENTS)
[10:58] sustrik which would return a combination of ZMQ_POLLIN and ZMQ_POLLOUT
[11:02] jugg ZMQ_EVENTS currently does not yet exist?
[11:05] sustrik it does not, but feel free to add it
[11:05] sustrik socket_base_t has two functions
[11:05] sustrik has_in () and has_out ()
[11:05] sustrik both returning bool
[11:06] sustrik just call the two and return the result as the EVENTS socket option
[11:08] jugg ok, looks like that is in the current erlang bindings patch against zeromq.
[11:09] jugg so if I understand correctly, the migrating socket between threads commit does away with the need for the zmq_process api addition.
[12:09] sustrik jugg: yes
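sustrik's suggestion above (call has_in () and has_out () and return the result as the EVENTS option) amounts to packing two booleans into a bitmask. A minimal sketch of that step in C, not the actual zeromq2 implementation: the function name events_from_state and the SKETCH_* constants are hypothetical, and the constant values are assumed to mirror ZMQ_POLLIN (1) and ZMQ_POLLOUT (2) from zmq.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror ZMQ_POLLIN and ZMQ_POLLOUT; renamed so the sketch
   compiles without zmq.h. */
enum { SKETCH_POLLIN = 1, SKETCH_POLLOUT = 2 };

/* Hypothetical helper: combine the results of has_in () and has_out ()
   into the bitmask a ZMQ_EVENTS-style option could return. */
uint32_t events_from_state(bool has_in, bool has_out)
{
    uint32_t events = 0;
    if (has_in)
        events |= SKETCH_POLLIN;   /* at least one message can be read */
    if (has_out)
        events |= SKETCH_POLLOUT;  /* at least one message can be written */
    return events;
}
```

A caller would poll the fd obtained via getsockopt (ZMQ_FD), and once it signals POLLIN, query the events mask and test it against the two flags.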
[16:23] sd88g93 one problem: when i send double word size values, the same values dont appear on the receiving end. All the examples ive tried before are limited to strings and work fine. is there a special way to reconstruct multibyte data types ? note: it is not an endian issue because i'm still working on localhost network.
[17:06] guido_g there is no way of de-/serializing of non-strings
[17:24] cremes sd88g93: show us a pastie of the code you are using to transmit and receive
[17:34] sd88g93 ok, here's a pastbin:
[17:34] sd88g93 the client code is on top, server code on bottom
[17:35] sd88g93 i try to read the first uint32 received, and it's always a 0 , when i assigned it a one in the client code , this is req/rep sockets
[17:42] guido_g the initialization of the REP socket is missing
[17:43] guido_g i *guess* the client connects to publish_skt
[17:43] sd88g93 its up at the top , near /* server code */ comment
[17:43] guido_g no, it's not
[17:44] sd88g93 no, the publish socket is something else, the client connects to rep socket
[17:44] guido_g no it doesn't
[17:44] sd88g93 line 79
[17:44] guido_g client connects via tcp, qskt is inproc
[17:45] sd88g93 yeah, that's just the thread, there's a main that establishes a queue for the xreq/xrep
[17:45] sd88g93 but the code receives a message, but interprets the cmd variable as zero, when i send it as 1
[17:45] guido_g ok, can't tell what else is there, sorry
[17:46] sd88g93 no problem, i was just trying to make it simple case, didnt want to post too much
[17:48] sd88g93 only thing i can see different, is all the examples send char's, and i'm sending DWORD's ,
[18:00] cremes sd88g93: i suggest reducing this case to the simplest thing possible; you have all sorts of commands and frames and everything else in there
[18:00] cremes there is a lot of code noise so it's hard to zero in on the problem
[18:00] sd88g93 yeah, true
[18:01] sd88g93 one question though, when you send a uint32_t type, is the correct way to read the stream back into a variable by simply casting it to an uint32 ptr and dereferencing ? or perhaps is there a better way ?
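On the cast-and-dereference question: casting a byte buffer to uint32_t * can misbehave when the buffer is not suitably aligned and it conflicts with C's aliasing rules, so copying the bytes with memcpy is the usual portable alternative. A minimal sketch, assuming sender and receiver share the same byte order (as on localhost); the helper name read_u32 is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a 32-bit value out of a received buffer.
   Works for any alignment of buf. The value is interpreted in host
   byte order, so this is only correct when both ends agree on it.
   Returns 0 on success, -1 if the arguments are invalid or the buffer
   holds fewer than 4 bytes. */
int read_u32(const void *buf, size_t len, uint32_t *out)
{
    if (buf == NULL || out == NULL || len < sizeof *out)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

With the C interface, buf and len would typically come from zmq_msg_data () and zmq_msg_size () on the received message. Checking the size first also catches the case where the part being read is not the one that carries the value.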
[18:02] cremes sd88g93: i see a problem with line 103; since you are sending from a REQ to a XREP, the *first* message
[18:02] cremes you receive is the socket identity
[18:02] cremes s/message/message part/
[18:02] cremes so that's probably why this is breaking
[18:02] cremes change it to a REP socket (which handles that socket identity stuff for you) and try again
[18:03] cremes you need to be really careful when using XREQ/XREP sockets because some of the 0mq magic doesn't happen automatically anymore
[18:05] sd88g93 can i change the client socket to xreq ?
[18:07] cremes sd88g93: sure
[18:07] cremes it will "silently" send the socket identity as the first message part
[18:08] cremes your xrep code needs to receive that message part first; then it can grab the cmd, nframes, hash, whatever
[18:10] sd88g93 how big is the socket identity ? one byte ? 4 bytes ?
[18:11] cremes it's the first message part; it can be up to 255 bytes
[18:12] sd88g93 oh ok, so i just recv the first, throw it away , then get the second part
[18:12] cremes oh yeah, you don't need to do zmq_msg_init_* stuff when receiving
[18:12] cremes 0mq allocates the msg struct for you and passes it in
[18:12] sd88g93 not even zmq_msg_init() ?
[18:12] cremes nope; look at the examples again for receive
[18:13] sd88g93 oh ok, i think some of the examples do it that way
[18:13] cremes i'm doublechecking now...
[18:15] cremes nope, you definitely don't need to
[18:16] cremes you only need zmq_msg* and friends for allocating structures to *send*
[18:20] sd88g93 oh good
[18:22] sd88g93 i notice zeromq is fast evolving, good to see
[18:22] sd88g93 just a look on the mailing list shows a lot of progress, in the mail archives
[20:24] sustrik sd88g93, cremes: you DO have to call zmq_msg_init before zmq_recv
[20:25] sustrik the point is that zmq_recv deallocates old content of the message
[20:25] sd88g93 yes, just found that out, but you dont have to before recv() , right ?
[20:25] sustrik if there are bogus data in zmq_msg_t the deallocation can result in undefined behaviour
[20:26] sustrik there's no recv on 0mq sockets, just zmq_recv
[20:26] sd88g93 yeah, that's what i'm using
[20:26] sd88g93 i'm using c interface
[20:26] sustrik then call zmq_msg_init before zmq_recv
[20:27] sustrik (if the zmq_msg_t is uninitialised)
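sustrik's point, that the receive call deallocates the old content of the message and so must never see garbage, can be illustrated with a toy model. This is a sketch only: toy_msg_t, toy_msg_init, toy_msg_close and toy_recv are hypothetical stand-ins written for this example, not libzmq types or functions, and the real zmq_msg_t is laid out differently.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a message object; NOT the libzmq zmq_msg_t. */
typedef struct {
    void  *data;
    size_t size;
} toy_msg_t;

/* Analogue of zmq_msg_init: put the message into a known empty state. */
int toy_msg_init(toy_msg_t *msg)
{
    msg->data = NULL;
    msg->size = 0;
    return 0;
}

/* Analogue of zmq_msg_close: release whatever the message holds. */
int toy_msg_close(toy_msg_t *msg)
{
    free(msg->data);
    msg->data = NULL;
    msg->size = 0;
    return 0;
}

/* Analogue of a receive call: drop the old content, then store a copy
   of the incoming payload. Returns 0 on success, -1 on allocation
   failure (the old content is kept in that case). */
int toy_recv(toy_msg_t *msg, const void *payload, size_t size)
{
    void *copy = NULL;
    if (size > 0) {
        copy = malloc(size);
        if (copy == NULL)
            return -1;
        memcpy(copy, payload, size);
    }
    /* If msg was never initialised, msg->data is an indeterminate
       pointer and this free is undefined behaviour. */
    free(msg->data);
    msg->data = copy;
    msg->size = size;
    return 0;
}
```

In this model a message can be reused across several receives without re-initialising it, because each receive leaves it in a valid state; only the very first use needs the explicit init.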
[20:48] sd88g93 looks like if i make the pipe in the main server program with sockets XREP/XREQ and then in the individual threads as just REP , then i dont get the prepended socket id on the incoming pipe
[20:48] sd88g93 and then regular REP socket in the client
[20:52] cremes sustrik: noted; some of the examples put up on the blog are wrong then
[20:53] cremes sd88g93: that is correct; you only get the socket id when using the XREQ/XREP sockets; using REQ/REP will hide that detail
[20:54] sustrik cremes: which one?
[20:56] cremes sustrik:
[20:56] cremes perhaps others
[20:59] cremes nevermind; it's using the c++ wrapper so a call to zmq::message_t reply is calling one of the constructors
[20:59] cremes which calls zmq_msg_init behind the scenes
[21:00] sustrik yes
[21:51] speedy1 i have one small coding issue regarding 0mq - any devs around, perhaps?
[22:28] cremes speedy1: irc etiquette says to just ask the question and stay in the channel to see if anyone can answer
[22:28] cremes so ask
[22:42] speedy1 cremes: sorry about that - i have a deadlock on waiting for the 0mq worker thread to exit
[22:43] cremes what do you mean deadlock? are you blocked on a socket send or receive?
[22:43] speedy1 basically zmq_term(zmq_context) waits on WaitForSingleObject(descriptor, INFINITE), indefinitely
[22:43] cremes is that windows?
[22:43] speedy1 yep
[22:43] speedy1 0mq integrated inside Autodesk 3D Studio MAX :)
[22:44] cremes sounds like it might be a bug; i do know that the semantics of zmq_term are changing in the next release
[22:44] speedy1 (it's a MAX plug-in, actually)
[22:44] cremes if you are using 2.0.7 it should unblock any blocked sockets with ETERM, i believe
[22:44] speedy1 eh? in which way?
[22:44] speedy1 yep 2.0.7
[22:44] cremes take a look at the ML archives for the past 5 days or so; lots of discussion on what to do with zmq_term and related issues
[22:45] cremes i don't know that a decision has been reached yet
[22:45] cremes but for now, zmq_term should interrupt those sockets and terminate
[22:45] cremes if it isn't, i would file a bug report
[22:45] speedy1 mm.. and the behaviour in 2.0.7 is?
[22:46] speedy1 does it interrupt? or just stalls?
[22:46] cremes interrupt
[22:46] speedy1 (btw. i'm closing all the sockets before calling zmq_term)
[22:46] cremes see the docs here:
[22:47] cremes wait, you don't have any open sockets when you call it?
[22:48] speedy1 i think so - i close the only socket I have open..
[22:48] speedy1 before calling zmq_term()
[22:48] speedy1 i have to check the return code, though
[22:48] cremes regardless, it shouldn't be blocking on anything
[22:48] cremes have you tried reducing this to a simple code example?
[22:48] speedy1 it's like the worker thread does not exit when it should
[22:49] cremes it sounds like you have a lot going on with your code; this might not be a 0mq issue
[22:49] speedy1 yep, it's part of a big app (3D Studio MAX is one big beast)
[22:49] cremes i highly suggest eliminating all extraneous logic and writing a small code example that opens a socket, does something with it, closes it and terminates
[22:49] cremes if that hangs, then there is a library problem
[22:49] cremes i
[22:50] cremes if it doesn't, then there might be a problem elsewhere; time to divide and conquer
[22:50] speedy1 i'll try it.. it could be some kind of interaction between MAX and 0mq
[22:50] cremes sure
[22:51] speedy1 in windows, compared to unix / linux, many things can go haywire inside the process
[22:51] speedy1 do you perhaps know where in the code does the worker thread get signaled to quit?
[22:52] speedy1 and where does it decide to quit?
[22:52] cremes i do not know
[22:52] speedy1 i could start singlestepping in the debugger from those points..
[22:53] cremes save yourself a ton of trouble and try reducing the error to a simpler test case
[22:53] cremes unless you love single-stepping with your debugger ;)
[22:53] cremes gotta run to the store; brb
[22:54] speedy1 kk, thanks! :)
[23:11] locks hi, can anyone help me with installing the ruby gem?
[23:11] locks extconf can't find the zmq libs
[23:11] locks I'm on OSX