Friday June 25, 2010

[Time] Name Message
[10:38] vtl sustrik: hi!
[10:38] sustrik hi
[10:39] vtl sustrik: question:
[10:40] sustrik errno is thread-local
[10:40] sustrik so you get err1
[10:41] vtl sustrik: cool, 10x! as i thought... i have a different opinion from the guy who forked cl-zmq :)
[10:41] sustrik :)
[10:41] sustrik it's POSIX behaviour
[10:41] vtl yes
[10:41] sustrik let me find the doc
[10:42] vtl man errno, i think
[10:43] sustrik right
[10:43] sustrik it's ISO C
[10:59] vtl I think I understood... In managed environments (like lisp, python, java, .net) a thread may be interrupted by the runtime for garbage collection. If the GC calls library functions or syscalls, it may clobber this thread's errno. When such an event happens in between zmq_foo() and zmq_errno(), then zmq_errno() will return the wrong errno.
[11:04] sustrik right, that may happen
[11:19] sustrik vtl: thinking about it
[11:20] sustrik in theory, it would be possible to solve the problem in 0mq itself
[11:20] sustrik save the errno into 0mq's own thread-local variable
[11:20] sustrik and return that one from zmq_errno
[11:20] vtl I think this is not worth trying
[11:21] vtl because other foreign libraries will fail in the same way :)
[11:21] sustrik do you have a better solution?
[11:21] sustrik yes, definitely
[11:21] sustrik how is this solved in cl?
[11:21] vtl this is generally not solved in cl. but one particular commercial version of CL (Allegro) has a workaround for it
[11:22] sustrik ok, i see
[11:22] vtl I think it is possible to hack the CFFI library to solve this kind of problem
[11:22] vtl of course, other languages are still problematic here
[11:34] cremes sustrik: why not make errno a member/property of the context class?
[11:34] cremes the system errno could be copied into that member after each 0mq call
[11:43] sustrik cremes: a socket option
[11:44] sustrik same as with Berkeley sockets (SO_ERROR)
[11:44] sustrik yes, this sounds more sane than zmq_errno
[11:44] CIA-17 zeromq2: Martin Hurton master * rfca2e8e / (7 files): Add SWAP support -
[11:47] cremes right, move it to the socket class (why would i suggest context? d'oh!)
[11:50] cremes though that's another api change/break
[11:56] sustrik cremes: right
[11:56] sustrik in theory it may be done in 2 steps
[11:57] sustrik 1. add the error socket option
[11:57] sustrik 2. remove zmq_errno
[11:57] sustrik the latter can be done when major version number is bumped
[12:30] jugg sustrik: recent commit "devices exit in case of context termination" 11891d : src/forwarder.cpp lines 33 and 40, I believe are missing the necessary "rc = " assignment?
[12:31] jugg hmm, same thing in src/streamer.cpp
[12:32] sustrik let me see
[12:34] sustrik oops
[12:34] sustrik let me correct it!
[12:41] jugg Using C++ bindings, it seems that the only time one needs to do a msg.rebuild is before a send. But between a send and a recv, or between a recv and another recv, this is not necessary, as it would appear the zmq_recv internals close a message and then re-initialize it as needed. Is this correct?
[12:43] jugg (the same could be stated/asked for the C api, restating with message close/init pairing instead of the C++ message rebuild)
[12:44] sustrik ok, fixed
[12:44] sustrik jugg: yes
[12:44] sustrik the only use for rebuild is when you have a message as a member variable of a class
[12:45] sustrik then you want to, say, resize it
[12:45] sustrik you would have to destroy it and reinstantiate it
[12:45] sustrik but that's not possible because it's a member variable!
[12:45] sustrik so you would have to allocate it dynamically or something...
[12:45] sustrik instead, you can simply call rebuild
[12:48] jugg ok, but whether you destroy or rebuild, this is only necessary if you want to send a message of a different size. Anytime you want to recv a message, there is no reason to free the associated memory from previous use of the message... eg, the zmq internals aren't going to leak memory previously allocated for a message.
[12:50] jugg I'm talking about the zmq_msg_t memory not an instantiated C++ message class.
[12:55] sustrik jugg: yes
[12:55] sustrik i think it's mentioned in the docs
[12:55] sustrik let me see...
[12:55] sustrik zmq_recv(3):
[12:55] sustrik Any content previously stored in msg
[12:55] sustrik shall be properly deallocated.
[13:03] jugg *sigh* I get caught up reading/walking through the code and forget about the documentation. apologies.
[13:04] sustrik :)
[13:28] jugg I have a setup where a SUB socket binds a TCP port on two different interfaces. There are two PUB sockets that connect, one to each interface of the SUB socket. I experienced an instance with this setup that messages quit flowing from one of the publishers to the subscriber while messages continued to flow from the other publisher.
[13:28] jugg However there were no errors, or anything to indicate anything was wrong besides the fact the messages weren't flowing. At the OS level, the TCP port for the "dead" connection was still active, with one in LISTEN state, and the other in ESTABLISHED for the SUB and PUB socket respectively.
[13:29] jugg Restarting the publisher application restored communications.
[13:30] jugg If this happens in the future, any suggestions on how to inspect what is going on?
[13:33] sustrik jugg: my guess would be that there's a loophole in the fair queueing algorithm somewhere
[13:33] sustrik if there are messages available from both publishers, the SUB socket round-robins between them
[13:34] sustrik what you describe looks like SUB erroneously believes one of the pipes has no messages
[13:34] sustrik (although it does)
[13:35] sustrik and doesn't include it in the round robin
[13:35] jugg where should I be looking in the zmq code?
[13:36] sustrik jugg: are you using zmq_poll or just zmq_recv?
[13:37] jugg just zmq_recv
[13:37] sustrik then it's fq_t::recv
[13:38] sustrik fq.cpp:81
[13:54] jugg change of topic for a moment, working through that code is going to take a bit... with multi-part messages, are the parts stacked up on the sending side or the receiving side?
[14:00] sustrik jugg: both sides
[14:00] sustrik the rule is that they are stacked on the write side of the pipe
[14:01] sustrik one pipe being between sender thread and sender's I/O thread
[14:01] sustrik other one being between receiver's I/O thread and receiver thread itself
[14:27] jugg ok, so before they reach the receiver's I/O thread, they've been stacked up in the sender's I/O thread. Then they are all sent to the receiver, and the receiver's I/O thread stacks them up until they are all received before passing them off to the receiver thread, yes?
[14:30] sustrik yes
[14:38] jugg ok, so why expose the multipart concept to the receiving side at all then? Why not assemble it all into a single message for final delivery? The above structure provides no benefit for reducing total transfer time, nor allowing the receiving end to work on parts of the messages as they come in, thus reducing total processing time.
[14:39] jugg It seems to me that either the receiving side should just get a single message delivered to the receiver thread, or that it shouldn't be atomic.
[14:41] sustrik jugg: the goal here is to allow for some basic structure in the message content
[14:42] sustrik so if sender has say 3 big matrices in different places in memory
[14:42] sustrik he'll use a multi-part message as a means to achieve zero-copy
[14:43] sustrik however, he still wants to tell where the boundaries between matrices are on the receiving side
[14:43] sustrik that's why boundaries between message parts are honoured
[14:44] sustrik 0mq uses this mechanism under the cover btw to distinguish 0mq-specific data on the wire from the user data
[14:48] jugg ok, that makes sense. So, perhaps two alternative multi-part implementation feature requests then: 1. allow the sending side to send each part immediately, and only stack them in the receiving side's I/O thread. 2. allow non-atomic multi-part messaging.
[14:49] sustrik 1. makes sense
[14:50] sustrik 2. what would that be good for?
[14:51] jugg 2. A REQ is made (i.e. an SQL query) and the REP has multiple rows; if it were non-atomic, then each row could be sent back and operated on without waiting for the entire set to arrive.
[14:52] travlr my guess might be in a stream processing sense of individual message parts.
[14:52] travlr yeah, what he said :)
[14:53] jugg Another use is a REQ is made, and something more intensive like, a file set - a bunch of images - is returned. These images need to be resized. There is no reason to wait for the entire set to arrive.
[14:58] sustrik jugg: i would say each image (or row) is a separate message in these scenarios
[14:58] sustrik the rationale is that all the elements in the set are of the same type
[14:58] sustrik and thus eligible for parallelised processing or similar
[14:59] sustrik message parts make sense where there are different elements concatenated into a single message
[14:59] sustrik for example 0mq-routing-data + user-data
[14:59] travlr or a topic + user-data
[14:59] sustrik yes
[15:00] sustrik different semantics is the key here
[15:00] sustrik from this point of view atomicity makes perfect sense
[15:00] sustrik it doesn't make sense to deliver just the routing data
[15:00] sustrik or a topic
[15:01] sustrik and say load-balance the user-content somewhere else
[15:02] travlr seeing the big picture along with the nuances is important for the various concepts in 0mq, huh.
[15:02] sustrik yes, this kind of thing is missing from docs :(
[15:02] sustrik but anyway, i have no idea where it should be put
[15:03] travlr i want to help with docs in the near future
[15:03] travlr i'm still studying it all though for now
[15:03] sustrik do you have an idea of what exactly you would like to do?
[15:03] travlr docs?
[15:04] sustrik yes
[15:04] sustrik I liked Nicholas' blog yesterday
[15:04] travlr yes
[15:04] sustrik it seems this kind of stuff is highly needed
[15:04] travlr very much
[15:04] travlr along the same vein martin
[15:05] travlr first i want to understand 0mq inside out, which is what i'm working on atm
[15:05] jugg sustrik: I understand that, however, I think the SQL example (whether the results are images or something else) has its use case as well. The client knows that it wants an entire set of data, it doesn't know what comprises that set of data, and so it can't ask for each part individually.
[15:06] jugg But it makes sense from an efficiency standpoint to break the set of data into individual parts for transmission and processing. Certainly the smarts could be layered on top of 0MQ for getting each part individually, but having 0MQ support this natively greatly simplifies things.
[15:06] sustrik jugg: understood
[15:07] sustrik what you have in mind is some kind of "terminator" message
[15:07] sustrik a message that says "this is the end of a message group"
[15:08] sustrik however, my feeling is that this kind of feature should be layered on top of 0MQ
[15:08] jugg Well, what I want is a single REQ message to be able to receive multiple REP messages - whatever that looks like.
[15:08] jugg the way multipart messaging is exposed at the API level works for this very well. It is just the implementation that does not.
[15:09] sustrik ok, i see
[15:10] sustrik the scenario makes sense
[15:10] sustrik the implications are non trivial
[15:10] sustrik if server X1 sends a first row, then halts for an hour
[15:10] sustrik the client would read the first row
[15:10] sustrik then halt waiting for an hour
[15:11] sustrik although there may be other result sets available from server X2
[15:11] sustrik this cannot happen with simple REQ socket
[15:11] sustrik but it can happen with XREQ
[15:11] sustrik it's complex stuff, lot of space to experiment
[15:12] jugg I'm not sure what you meant by other result sets... a subsequent REQ can't be made until the REP is satisfied... this is no different from the current REQ/REP behavior... if I send a REQ, it waits an indefinite time for a REP.
[15:12] sustrik the problem is that there may be a queue in the middle
[15:13] sustrik the queue has to be able to process multiple requests at the same time
[15:13] sustrik otherwise it would work in lock-step fashion
[15:13] sustrik and the scalability would go south
[15:14] sustrik so the queue (composed of an XREQ and an XREP socket) would have to work at message-part scale
[15:14] sustrik which is doable, but not the state of affairs right now
[15:14] sustrik if you are interested in the topic feel free to propose a solution
[15:15] jugg I guess from my limited understanding of the internals, looking at it from the point of view that multi-part messaging exists and works, the only change (and maybe this is the sticking point) is to make the multi-parts non-stacking on either end.
[15:17] sustrik the main problem is in the middle
[15:17] sustrik how would you ensure the queue isn't stuck when there's a half-sent multi-part message being processed
[15:17] sustrik and the sender dies without terminating it?
[15:18] sustrik also, you need fairness guarantees, so the queue cannot process a very long recordset in a single go
[15:18] jugg You've mentioned this "middle queues" before, I'm still at a loss on them and what they are... :/
[15:18] sustrik instead it has to assign it a timeslice, process part of it, then move to another client etc.
[15:19] sustrik "queue device"
[15:19] sustrik it's a component both requesters and repliers can connect to
[15:19] sustrik it then load balances the requests and routes the replies back
[15:20] sustrik see queue.cpp
[15:36] travlr sustrik: stuff like your previous conversation need to go in faq or clarified elsewhere etc.
[15:36] travlr i'll be scrubbing the irc and mail list eventually
[15:36] sustrik you mean the message part stuff?
[15:37] travlr well anything with nuance
[15:37] sustrik the question is what should go to FAQ, what should go to docs and what should go elsewhere
[15:38] sustrik stacking the technical info into FAQ is probably not the best solution possible
[15:38] travlr i'm just saying that i'll keep in mind these issues as i go and will help any way i can with stuff like docs
[15:38] sustrik yes, i'm just thinking out loud
[15:39] sustrik maybe there's some kind of "ideology" document missing
[15:39] travlr thats true for all of foss
[15:40] sustrik i.e. not the strict technical reference
[15:40] sustrik but some talk about what individual features are intended for
[15:40] jugg sustrik: what happens with the current multipart if the sender dies while the I/O thread is sending out the messages? Same problem, no?
[15:40] sustrik and why they are designed in the way they are etc.
[15:40] travlr sustrik: we'll have a conversation about this soon
[15:40] travlr and i'll go to town
[15:41] sustrik jugg: no, the incomplete messages are rolled back in that case
[15:41] sustrik travlr: ok
[15:48] jugg sustrik: Could the internals track whether the final message in a non atomic multipart message has been received, and if it has not, then the recv function return an error if the internals detect a disconnect of the sender? Failing such a possibility, I'd say if the only thing hindering this capability are intermediate queues, then document the risk and recommend not using them for this particular usage.
[15:50] jugg I haven't really understood why these "devices" are part of the core implementation anyway, they do not require access (afaict) to the internals of 0MQ, and could be implemented as stand alone applications/libraries, or even just example code.
[15:53] sustrik jugg: yes, right now the implementation of devices is pretty trivial
[15:53] sustrik in the future it's going to be more tightly integrated with the core
[15:54] sustrik as for the usage of devices, have a look here:
[15:55] sustrik that's pretty straightforward usage of queue device
[15:56] sustrik with that in mind, try to put down your non-atomic multi-part messages idea into email and send it to the mailing list
[15:56] sustrik it would be good to see some discussion on the topic
[15:59] sustrik btw, i would suggest naming these "stream messages"; it's shorter than "non-atomic multi-part messages"
[16:07] jugg sustrik: will do
[16:08] jugg thanks for working through all of that. I need to go back and dig into that possible fair queue issue now.
[16:09] sustrik you are welcome
[16:35] CIA-17 zeromq2: Pieter Hintjens master * r1dda8a2 src/msg_store.cpp : Used more expressive variable names -