Friday June 25, 2010

[Time] Name Message
[10:38] vtl sustrik: hi!
[10:38] sustrik hi
[10:39] vtl sustrik: question:
[10:40] sustrik errno is thread-local
[10:40] sustrik so you get err1
[10:41] vtl sustrik: cool, 10x! as i thought... i have a different opinion from the guy who forked cl-zmq :)
[10:41] sustrik :)
[10:41] sustrik it's POSIX behaviour
[10:41] vtl yes
[10:41] sustrik let me find the doc
[10:42] vtl man errno, i think
[10:43] sustrik right
[10:43] sustrik it's ISO C
[10:59] vtl I think I understood... In managed environments (like lisp, python, java, .net) a thread may be interrupted by the runtime for garbage collection. If the GC calls library functions or syscalls, it may clobber this thread's errno. When such an event happens in between zmq_foo() and zmq_errno(), then zmq_errno() will return the wrong errno.
[11:04] sustrik right, that may happen
[11:19] sustrik vtl: thinking about it
[11:20] sustrik in theory, it would be possible to solve the problem in 0mq itself
[11:20] sustrik save the errno into 0mq's own thread-local variable
[11:20] sustrik and return that one from zmq_errno
[11:20] vtl I think this is not worth trying
[11:21] vtl because other foreign libraries will fail in the same way :)
[11:21] sustrik do you have a better solution?
[11:21] sustrik yes, definitely
[11:21] sustrik how is this solved in cl?
[11:21] vtl this is generally not solved in cl. but one particular commercial version of CL (Allegro) has a workaround for it
[11:22] sustrik ok, i see
[11:22] vtl I think it is possible to hack the CFFI library to solve this kind of problem
[11:22] vtl of course, other languages are still problematic here
[11:34] cremes sustrik: why not make errno a member/property of the context class?
[11:34] cremes the system errno could be copied into that member after each 0mq call
[11:43] sustrik cremes: a socket option
[11:44] sustrik same as with Berkeley sockets (SO_ERROR)
[11:44] sustrik yes, this sounds more sane than zmq_errno
[11:44] CIA-17 zeromq2: Martin Hurton master * rfca2e8e / (7 files): Add SWAP support -
[11:47] cremes right, move it to the socket class (why would i suggest context? d'oh!)
[11:50] cremes though that's another api change/break
[11:56] sustrik cremes: right
[11:56] sustrik in theory it may be done in 2 steps
[11:57] sustrik 1. add the error socket option
[11:57] sustrik 2. remove zmq_errno
[11:57] sustrik the latter can be done when major version number is bumped
[12:30] jugg sustrik: recent commit "devices exit in case of context termination" 11891d : src/forwarder.cpp lines 33 and 40, I believe are missing the necessary "rc = " assignment?
[12:31] jugg hmm, same thing in src/streamer.cpp
[12:32] sustrik let me see
[12:34] sustrik oops
[12:34] sustrik let me correct it!
[12:41] jugg Using C++ bindings, it seems that the only time one needs to do a msg.rebuild is before a send. But between a send and a recv, or between a recv and another recv, this is not necessary, as it would appear the zmq_recv internals close a message and then re-initialize it as needed. Is this correct?
[12:43] jugg (the same could be stated/asked for the C api, restating with message close/init pairing instead of the C++ message rebuild)
[12:44] sustrik ok, fixed
[12:44] sustrik jugg: yes
[12:44] sustrik the only use for rebuild is when you have a message as a member variable of a class
[12:45] sustrik then you want to, say, resize it
[12:45] sustrik you would have to destroy it and reinstantiate it
[12:45] sustrik but that's not possible because it's a member variable!
[12:45] sustrik so you would have to allocate it dynamically or something...
[12:45] sustrik instead, you can simply call rebuild
[12:48] jugg ok, but whether you destroy or rebuild, this is only necessary if you want to send a message of a different size. Anytime you want to recv a message, there is no reason to free the associated memory from previous use of the message... eg, the zmq internals aren't going to leak memory previously allocated for a message.
[12:50] jugg I'm talking about the zmq_msg_t memory not an instantiated C++ message class.
[12:55] sustrik jugg: yes
[12:55] sustrik i think it's mentioned in the docs
[12:55] sustrik let me see...
[12:55] sustrik zmq_recv(3):
[12:55] sustrik Any content previously stored in msg
[12:55] sustrik shall be properly deallocated.
[13:03] jugg *sigh* I get caught up reading/walking through the code and forget about the documentation. apologies.
[13:04] sustrik :)
[13:28] jugg I have a setup where a SUB socket binds a TCP port on two different interfaces. There are two PUB sockets that connect, one to each interface of the SUB socket. I experienced an instance with this setup that messages quit flowing from one of the publishers to the subscriber while messages continued to flow from the other publisher.
[13:28] jugg However there were no errors, or anything to indicate anything was wrong besides the fact the messages weren't flowing. At the OS level, the TCP port for the "dead" connection was still active, with one in LISTEN state, and the other in ESTABLISHED for the SUB and PUB socket respectively.
[13:29] jugg Restarting the publisher application restored communications.
[13:30] jugg If this happens in the future, any suggestions on how to inspect what is going on?
[13:33] sustrik jugg: my guess would be that there's a loophole in the fair queueing algorithm somewhere
[13:33] sustrik if there are messages available from both publishers, the SUB socket round-robins between them
[13:34] sustrik what you describe looks like SUB erroneously believes one of the pipes has no messages
[13:34] sustrik (although it does)
[13:35] sustrik and doesn't include it in the round robin
[13:35] jugg where should I be looking in the zmq code?
[13:36] sustrik jugg: are you using zmq_poll or just zmq_recv?
[13:37] jugg just zmq_recv
[13:37] sustrik then it's fq_t::recv
[13:38] sustrik fq.cpp:81
[13:54] jugg change of topic for a moment, working through that code is going to take a bit... with multi-part messages, are the parts stacked up on the sending side or the receiving side?
[14:00] sustrik jugg: both sides
[14:00] sustrik the rule is that they are stacked on the write side of the pipe
[14:01] sustrik one pipe being between sender thread and sender's I/O thread
[14:01] sustrik other one being between receiver's I/O thread and receiver thread itself
[14:27] jugg ok, so before they reach the receiver's I/O thread, they've been stacked up in the sender's I/O thread. Then they are all sent to the receiver, and the receiver's I/O thread stacks them up until they are all received before passing them off to the receiver thread, yes?
[14:30] sustrik yes
[14:38] jugg ok, so why expose the multipart concept to the receiving side at all then? Why not assemble it all into a single message for final delivery? The above structure provides no benefit for reducing total transfer time, nor allowing the receiving end to work on parts of the messages as they come in, thus reducing total processing time.
[14:39] jugg It seems to me that either the receiving side should just get a single message delivered to the receiver thread, or that it shouldn't be atomic.
[14:41] sustrik jugg: the goal here is to allow for some basic structure in the message content
[14:42] sustrik so if sender has say 3 big matrices in different places in memory
[14:42] sustrik he'll use a multi-part message as a means to achieve zero-copy
[14:43] sustrik however, he still wants to tell where the boundaries between matrices are on the receiving side
[14:43] sustrik that's why boundaries between message parts are honoured
[14:44] sustrik 0mq uses this mechanism under the cover btw to distinguish 0mq-specific data on the wire from the user data
[14:48] jugg ok, that makes sense. So, perhaps two alternative multi-part implementation feature requests then: 1. allow the sending side to send each part immediately, and only stack them in the receiving side's I/O thread. 2. allow non-atomic multi-part messaging.
[14:49] sustrik 1. makes sense
[14:50] sustrik 2. what would that be good for?
[14:51] jugg 2. A REQ is made (i.e. an SQL query) and the REP has multiple rows; if it were non-atomic, then each row could be sent back and operated on without waiting for the entire set to arrive.
[14:52] travlr my guess might be in a stream processing sense of individual message parts.
[14:52] travlr yeah, what he said :)
[14:53] jugg Another use is a REQ is made, and something more intensive like, a file set - a bunch of images - is returned. These images need to be resized. There is no reason to wait for the entire set to arrive.
[14:58] sustrik jugg: i would say each image (or row) is a separate message in these scenarios
[14:58] sustrik the rationale is that all the elements in the set are of the same type
[14:58] sustrik and thus eligible for parallelised processing or similar
[14:59] sustrik message parts make sense where there are different elements concatenated into a single message
[14:59] sustrik for example 0mq-routing-data + user-data
[14:59] travlr or a topic + user-data
[14:59] sustrik yes
[15:00] sustrik different semantics is the key here
[15:00] sustrik from this point of view atomicity makes perfect sense
[15:00] sustrik it doesn't make sense to deliver just the routing data
[15:00] sustrik or a topic
[15:01] sustrik and say load-balance the user-content somewhere else
[15:02] travlr seeing the big picture along with the nuances is important for the various concepts in 0mq, huh.
[15:02] sustrik yes, this kind of thing is missing from docs :(
[15:02] sustrik but anyway, i have no idea where it should be put
[15:03] travlr i want to help with docs in the near future
[15:03] travlr i'm still studying it all though for now
[15:03] sustrik do you have an idea of what exactly you would like to do?
[15:03] travlr docs?
[15:04] sustrik yes
[15:04] sustrik I liked Nicholas' blog yesterday
[15:04] travlr yes
[15:04] sustrik it seems this kind of stuff is highly needed
[15:04] travlr very much
[15:04] travlr along the same vein martin
[15:05] travlr first i want to understand 0mq inside out, which is what i'm working on atm
[15:05] jugg sustrik: I understand that, however, I think the SQL example (whether the results are images or something else) has its use case as well. The client knows that it wants an entire set of data, it doesn't know what comprises that set of data, and so it can't ask for each part individually.
[15:06] jugg But it makes sense from an efficiency standpoint to break the set of data into individual parts for transmission and processing. Certainly the smarts could be layered on top of 0MQ for getting each part individually, but having 0MQ support this natively greatly simplifies things.
[15:06] sustrik jugg: understood
[15:07] sustrik what you have in mind is some kind of "terminator" message
[15:07] sustrik a message that says "this is the end of a message group"
[15:08] sustrik however, my feeling is that this kind of feature should be layered on top of 0MQ
[15:08] jugg Well, what I want is a single REQ message to be able to receive multiple REP messages - whatever that looks like.
[15:08] jugg the way multipart messaging is exposed at the API level works for this very well. It is just the implementation that does not.
[15:09] sustrik ok, i see
[15:10] sustrik the scenario makes sense
[15:10] sustrik the implications are non trivial
[15:10] sustrik if server X1 sends a first row, then halts for an hour
[15:10] sustrik the client would read the first row
[15:10] sustrik then halt waiting for an hour
[15:11] sustrik although there may be other result sets available from server X2
[15:11] sustrik this cannot happen with simple REQ socket
[15:11] sustrik but it can happen with XREQ
[15:11] sustrik it's complex stuff, lot of space to experiment
[15:12] jugg I'm not sure what you meant by other result sets... a subsequent REQ can't be made until the REP is satisfied... this is no different from the current REQ/REP behavior... if I send a REQ, it waits an indefinite time for a REP.
[15:12] sustrik the problem is that there may be a queue in the middle
[15:13] sustrik the queue has to be able to process multiple requests at the same time
[15:13] sustrik otherwise it would work in lock-step fashion
[15:13] sustrik and the scalability would go south
[15:14] sustrik so the queue (composed of an XREQ and an XREP socket) would have to work at message-part scale
[15:14] sustrik which is doable, but not the state of affairs right now
[15:14] sustrik if you are interested in the topic feel free to propose a solution
[15:15] jugg I guess from my limited understanding of the internals, looking at it from the point of view that multi-part messaging exists and works, the only change (and maybe this is the sticking point) is to make the multi-parts non-stacking on either end.
[15:17] sustrik the main problem is in the middle
[15:17] sustrik how would you ensure the queue isn't stuck when there's a half-sent multi-part message being processed
[15:17] sustrik and the sender dies without terminating it?
[15:18] sustrik also, you need fairness guarantees, so the queue cannot process a very long recordset in a single go
[15:18] jugg You've mentioned this "middle queues" before, I'm still at a loss on them and what they are... :/
[15:18] sustrik instead it has to assign it a timeslice, process part of it, then move to another client etc.
[15:19] sustrik "queue device"
[15:19] sustrik it's a component both requesters and repliers can connect to
[15:19] sustrik it then load balances the requests and routes the replies back
[15:20] sustrik see queue.cpp
[15:36] travlr sustrik: stuff like your previous conversation need to go in faq or clarified elsewhere etc.
[15:36] travlr i'll be scrubbing the irc and mail list eventually
[15:36] sustrik you mean the message part stuff?
[15:37] travlr well anything with nuance
[15:37] sustrik the question is what should go to FAQ, what should go to docs and what should go elsewhere
[15:38] sustrik stacking the technical info into FAQ is probably not the best solution possible
[15:38] travlr i'm just saying that i'll keep in mind these issues as i go and will help any way i can with stuff like docs
[15:38] sustrik yes, i'm just thinking out loud
[15:39] sustrik maybe there's some kind of "ideology" document missing
[15:39] travlr thats true for all of foss
[15:40] sustrik i.e. not the strict technical reference
[15:40] sustrik but some talk about what individual features are intended for
[15:40] jugg sustrik: what happens with the current multipart if the sender dies while the I/O thread is sending out the messages? Same problem, no?
[15:40] sustrik and why they are designed in the way they are etc.
[15:40] travlr sustrik: we'll have a conversation about this soon
[15:40] travlr and i'll go to town
[15:41] sustrik jugg: no, the incomplete messages are rolled back in that case
[15:41] sustrik travlr: ok
[15:48] jugg sustrik: Could the internals track whether the final message in a non atomic multipart message has been received, and if it has not, then the recv function return an error if the internals detect a disconnect of the sender? Failing such a possibility, I'd say if the only thing hindering this capability are intermediate queues, then document the risk and recommend not using them for this particular usage.
[15:50] jugg I haven't really understood why these "devices" are part of the core implementation anyway, they do not require access (afaict) to the internals of 0MQ, and could be implemented as stand alone applications/libraries, or even just example code.
[15:53] sustrik jugg: yes, right now the implementation of devices is pretty trivial
[15:53] sustrik in the future it's going to be more tightly integrated with the core
[15:54] sustrik as for the usage of devices, have a look here:
[15:55] sustrik that's pretty straightforward usage of queue device
[15:56] sustrik with that in mind, try to put down your non-atomic multi-part messages idea into email and send it to the mailing list
[15:56] sustrik it would be good to see some discussion on the topic
[15:59] sustrik btw, i would suggest naming these "stream messages"; it's shorter than "non-atomic multi-part messages"
[16:07] jugg sustrik: will do
[16:08] jugg thanks for working through all of that. I need to go back and dig into that possible fair queue issue now.
[16:09] sustrik you are welcome
[16:35] CIA-17 zeromq2: Pieter Hintjens master * r1dda8a2 src/msg_store.cpp : Used more expressive variable names -