[10:38] <vtl> sustrik: hi!
[10:38] <sustrik> hi
[10:39] <vtl> sustrik: question: http://paste.lisp.org/display/111844
[10:40] <sustrik> errno is thread-local
[10:40] <sustrik> so you get err1
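The point sustrik is making can be sketched in plain POSIX C, with no 0MQ involved (`worker` and `errno_demo` are invented names for this illustration): each thread gets its own `errno`, so one thread's failure code survives even if another thread fails concurrently.

```c
/* Minimal demo that errno is thread-local under POSIX: each thread
 * stores its own value in errno, yields to let the other thread run,
 * then checks that its value was not clobbered. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

static void *worker (void *arg)
{
    int my_err = *(int*) arg;
    errno = my_err;              /* like a failing call would do */
    sched_yield ();              /* give the other thread a chance to run */
    assert (errno == my_err);    /* still ours: errno is per-thread */
    return NULL;
}

int errno_demo (void)
{
    int e1 = 11, e2 = 22;        /* arbitrary error codes for the demo */
    pthread_t t1, t2;
    pthread_create (&t1, NULL, worker, &e1);
    pthread_create (&t2, NULL, worker, &e2);
    pthread_join (t1, NULL);
    pthread_join (t2, NULL);
    return 0;                    /* both asserts held */
}
```

So in vtl's paste, the thread that made the failing call reads back its own err1, regardless of what other threads are doing.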
[10:41] <vtl> sustrik: cool, 10x! as i thought... we have a different opinion from the guy who forked cl-zmq :)
[10:41] <sustrik> :)
[10:41] <sustrik> it's POSIX behaviour
[10:41] <vtl> yes
[10:41] <sustrik> let me find the doc
[10:42] <vtl> man errno, i think
[10:43] <sustrik> right
[10:43] <sustrik> it's ISO C
[10:59] <vtl> I think I understood... In managed environments (like lisp, python, java, .net) a thread may be interrupted by the runtime for garbage collection. if the GC calls library functions or syscalls, it may clobber this thread's errno. When such an event happens between zmq_foo() and zmq_errno(), zmq_errno() will return the wrong errno.
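The race vtl describes can be sketched without 0MQ at all (`fake_zmq_foo` and `gc_hook` are stand-ins invented for this sketch, not real 0MQ or runtime functions): a failing call sets errno, an interposed runtime call clobbers it, and the later read sees the wrong value.

```c
/* Sketch of the errno-clobbering race: errno set by a failing call is
 * overwritten before the caller gets around to reading it. */
#include <assert.h>
#include <errno.h>
#include <unistd.h>

static int fake_zmq_foo (void)   /* fails, reporting the cause via errno */
{
    errno = EINVAL;
    return -1;
}

static void gc_hook (void)       /* a runtime interruption (GC, etc.) */
{
    close (-1);                  /* a failing syscall; sets errno to EBADF */
}

/* returns 1 if the race was observed (errno no longer EINVAL), else 0 */
int race_demo (void)
{
    int rc = fake_zmq_foo ();
    assert (rc == -1);
    gc_hook ();                  /* runs between the call and the errno read */
    /* a zmq_errno()-style read now sees EBADF, not the real EINVAL */
    return errno == EINVAL ? 0 : 1;
}
```

This is exactly the window between zmq_foo() and zmq_errno() that vtl means: the C-visible errno read happens later than the call, and anything the managed runtime does in between can overwrite it.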
[11:04] <sustrik> right, that may happen
[11:19] <sustrik> vtl: thinking about it
[11:20] <sustrik> in theory, it would be possible to solve the problem in 0mq itself
[11:20] <sustrik> save the errno into a 0mq-own thread-local variable
[11:20] <sustrik> and return that one from zmq_errno
[11:20] <vtl> I think this is not worth trying
[11:21] <vtl> because other foreign libraries will fail in the same way :)
[11:21] <sustrik> do you have a better solution?
[11:21] <sustrik> yes, definitely
[11:21] <sustrik> how is this solved in cl?
[11:21] <vtl> this is generally not solved in cl. but one particular commercial version of CL (Allegro) has a workaround for it
[11:22] <sustrik> ok, i see
[11:22] <vtl> I think it is possible to hack the CFFI library to solve this kind of problem
[11:22] <vtl> of course, other languages are still problematic here
[11:34] <cremes> sustrik: why not make errno a member/property of the context class?
[11:34] <cremes> the system errno could be copied into that member after each 0mq call
[11:43] <sustrik> cremes: a socket option
[11:44] <sustrik> same as with Berkeley sockets (SO_ERROR)
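The Berkeley-socket precedent sustrik refers to looks like this in plain POSIX (no 0MQ; `so_error_demo` is an invented name): a socket's pending error is fetched per-socket with getsockopt(SO_ERROR) rather than from a process- or thread-global.

```c
/* SO_ERROR: per-socket error retrieval, the model proposed for 0mq. */
#include <sys/socket.h>
#include <unistd.h>

/* returns the socket's pending error (0 on a fresh socket), -1 on failure */
int so_error_demo (void)
{
    int fd = socket (AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;
    int err = -1;
    socklen_t len = sizeof (err);
    int rc = getsockopt (fd, SOL_SOCKET, SO_ERROR, &err, &len);
    close (fd);
    /* no error is pending on a freshly created socket, so err is 0 */
    return rc == 0 ? err : -1;
}
```

Because the error lives in the socket object itself, no GC or foreign call running on the same thread can clobber it between the failing operation and the read.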
[11:44] <sustrik> yes, this sounds more sane than zmq_errno
[11:44] <CIA-17> zeromq2: Martin Hurton master * rfca2e8e (7 files): Add SWAP support - http://bit.ly/bRzKds
[11:47] <cremes> right, move it to the socket class (why would i suggest context? d'oh!)
[11:50] <cremes> though that's another api change/break
[11:56] <sustrik> cremes: right
[11:56] <sustrik> in theory it may be done in 2 steps
[11:57] <sustrik> 1. add the error socket option
[11:57] <sustrik> 2. remove zmq_errno
[11:57] <sustrik> the latter can be done when the major version number is bumped
[12:30] <jugg> sustrik: recent commit "devices exit in case of context termination" 11891d : src/forwarder.cpp lines 33 and 40, I believe, are missing the necessary "rc = " assignment?
[12:31] <jugg> hmm, same thing in src/streamer.cpp
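The bug class jugg spotted can be shown with a stub (`step`, `buggy_loop`, and `fixed_loop` are invented for this sketch; the actual fix in forwarder.cpp/streamer.cpp was simply adding the missing `rc = `): calling a fallible function without assigning its return code, so the subsequent error check tests a stale value.

```c
/* Missing "rc = " bug: the second call's failure is silently dropped. */

static int step (int should_fail)    /* stand-in for a fallible zmq call */
{
    return should_fail ? -1 : 0;
}

int buggy_loop (void)
{
    int rc = step (0);       /* succeeds, rc == 0 */
    step (1);                /* BUG: return value discarded, rc still 0 */
    return rc;               /* reports success despite the failure */
}

int fixed_loop (void)
{
    int rc = step (0);
    if (rc == 0)
        rc = step (1);       /* fixed: the failure is propagated */
    return rc;
}
```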
[12:32] <sustrik> let me see
[12:34] <sustrik> oops
[12:34] <sustrik> let me correct it!
[12:41] <jugg> Using the C++ bindings, it seems that the only time one needs to do a msg.rebuild is before a send. But between a send and a recv, or between a recv and another recv, this is not necessary, as it would appear the zmq_recv internals close the message and then re-initialize it as needed. Is this correct?
[12:43] <jugg> (the same could be stated/asked for the C api, restating with message close/init pairing instead of the C++ message rebuild)
[12:44] <sustrik> ok, fixed
[12:44] <sustrik> jugg: yes
[12:44] <sustrik> the only use for rebuild is when you have a message as a member variable of a class
[12:45] <sustrik> then you want to, say, resize it
[12:45] <sustrik> you would have to destroy it and reinstantiate it
[12:45] <sustrik> but that's not possible because it's a member variable!
[12:45] <sustrik> so you would have to allocate it dynamically or something...
[12:45] <sustrik> instead, you can simply call rebuild
[12:48] <jugg> ok, but whether you destroy or rebuild, this is only necessary if you want to send a message of a different size. Any time you want to recv a message, there is no reason to free the associated memory from previous use of the message... eg, the zmq internals aren't going to leak memory previously allocated for a message.
[12:50] <jugg> I'm talking about the zmq_msg_t memory, not an instantiated C++ message class.
[12:55] <sustrik> jugg: yes
[12:55] <sustrik> i think it's mentioned in the docs
[12:55] <sustrik> let me see...
[12:55] <sustrik> zmq_recv(3): "Any content previously stored in msg shall be properly deallocated."
[13:03] <jugg> *sigh* I get caught up reading / walking through the code and forget about the documentation. apologies.
[13:04] <sustrik> :)
[13:28] <jugg> I have a setup where a SUB socket binds a TCP port on two different interfaces. There are two PUB sockets that connect, one to each interface of the SUB socket. I experienced an instance with this setup where messages quit flowing from one of the publishers to the subscriber while messages continued to flow from the other publisher.
[13:28] <jugg> However, there were no errors or anything to indicate anything was wrong besides the fact that messages weren't flowing. At the OS level, the TCP port for the "dead" connection was still active, with one end in LISTEN state and the other in ESTABLISHED, for the SUB and PUB socket respectively.
[13:29] <jugg> Restarting the publisher application restored communications.
[13:30] <jugg> If this happens in the future, any suggestions on how to inspect what is going on?
[13:33] <sustrik> jugg: my guess would be that there's a loophole in the fair queueing algorithm somewhere
[13:33] <sustrik> if there are messages available from both publishers, the SUB socket round-robins between them
[13:34] <sustrik> what you describe looks like SUB erroneously believes one of the pipes has no messages
[13:34] <sustrik> (although it does)
[13:35] <sustrik> and doesn't include it in the round robin
[13:35] <jugg> where should I be looking in the zmq code?
[13:36] <sustrik> jugg: are you using zmq_poll or just zmq_recv?
[13:37] <jugg> just zmq_recv
[13:37] <sustrik> then it's fq_t::recv
[13:38] <sustrik> fq.cpp:81
[13:54] <jugg> change of topic for a moment, working through that code is going to take a bit... on multi-part messages, are the parts stacked up on the sending side or the receiving side?
[14:00] <sustrik> jugg: both sides
[14:00] <sustrik> the rule is that they are stacked on the write side of the pipe
[14:01] <sustrik> one pipe being between the sender thread and the sender's I/O thread
[14:01] <sustrik> the other one being between the receiver's I/O thread and the receiver thread itself
[14:27] <jugg> ok, so before they reach the receiver's I/O thread, they've been stacked up in the sender's I/O thread. Then they are all sent to the receiver, and the receiver's I/O thread stacks them up until they are all received before passing them off to the receiver thread, yes?
[14:30] <sustrik> yes
[14:38] <jugg> ok, so why expose the multipart concept to the receiving side at all then? Why not assemble it all into a single message for final delivery? The above structure provides no benefit for reducing total transfer time, nor does it allow the receiving end to work on parts of the message as they come in, thus reducing total processing time.
[14:39] <jugg> It seems to me that either the receiving side should just get a single message delivered to the receiver thread, or it shouldn't be atomic.
[14:41] <sustrik> jugg: the goal here is to allow for some basic structure in the message content
[14:42] <sustrik> so if the sender has, say, 3 big matrices in different places in memory
[14:42] <sustrik> he'll use a multi-part message as a means to achieve zero-copy
[14:43] <sustrik> however, he still wants to be able to tell the boundaries between the matrices on the receiving side
[14:43] <sustrik> that's why boundaries between message parts are honoured
[14:44] <sustrik> 0mq uses this mechanism under the covers btw to distinguish 0mq-specific data on the wire from the user data
[14:48] <jugg> ok, that makes sense. So, perhaps two alternate multi-part implementation feature requests then: 1. allow the sending side to send each part immediately, and only stack them on the receiving-side I/O thread. 2. allow non-atomic multi-part messaging.
[14:49] <sustrik> 1. makes sense
[14:50] <sustrik> 2. what would that be good for?
[14:51] <jugg> 2. A REQ is made (ie an SQL query) and the REP has multiple rows; if it was non-atomic, then each row could be sent back and operated on without waiting for the entire set to arrive.
[14:52] <travlr> my guess might be in a stream processing sense of individual message parts.
[14:52] <travlr> yeah, what he said :)
[14:53] <jugg> Another use is a REQ is made, and something more intensive, like a file set - a bunch of images - is returned. These images need to be resized. There is no reason to wait for the entire set to arrive.
[14:58] <sustrik> jugg: i would say each image (or row) is a separate message in these scenarios
[14:58] <sustrik> the rationale is that all the elements in the set are of the same type
[14:58] <sustrik> and thus eligible for parallelised processing or similar
[14:59] <sustrik> message parts make sense where there are different elements concatenated into a single message
[14:59] <sustrik> for example 0mq-routing-data + user-data
[14:59] <travlr> or a topic + user-data
[14:59] <sustrik> yes
[15:00] <sustrik> different semantics is the key here
[15:00] <sustrik> from this point of view atomicity makes perfect sense
[15:00] <sustrik> it doesn't make sense to deliver just the routing data
[15:00] <sustrik> or a topic
[15:01] <sustrik> and, say, load-balance the user-content somewhere else
[15:02] <travlr> seeing the big picture along with the nuances is important for the various concepts in 0mq, huh.
[15:02] <sustrik> yes, this kind of thing is missing from the docs :(
[15:02] <sustrik> but anyway, i have no idea where it should be put
[15:03] <travlr> i want to help with docs in the near future
[15:03] <travlr> i'm still studying it all though for now
[15:03] <sustrik> do you have an idea what exactly you would like to do?
[15:03] <travlr> docs?
[15:04] <sustrik> yes
[15:04] <sustrik> I liked Nicholas' blog yesterday
[15:04] <travlr> yes
[15:04] <sustrik> it seems this kind of stuff is highly needed
[15:04] <travlr> very much
[15:04] <travlr> along the same vein, martin
[15:05] <travlr> first i want to understand 0mq inside out, which is what i'm working on atm
[15:05] <jugg> sustrik: I understand that; however, I think the SQL example (whether the results are images or something else) has its use case as well. The client knows that it wants an entire set of data, but it doesn't know what comprises that set of data, and so it can't ask for each part individually.
[15:06] <jugg> But it makes sense from an efficiency standpoint to break the set of data into individual parts for transmission and processing. Certainly the smarts could be layered on top of 0MQ for getting each part individually, but it greatly simplifies things to have 0MQ support this natively.
[15:06] <sustrik> jugg: understood
[15:07] <sustrik> what you have in mind is some kind of "terminator" message
[15:07] <sustrik> a message that says "this is the end of a message group"
[15:08] <sustrik> however, my feeling is that this kind of feature should be layered on top of 0MQ
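The "layered on top of 0MQ" idea can be sketched without touching the library at all (`recv_group`, `msg_source_t`, and `stub_source` are invented for this sketch; a real version would wrap zmq_recv): the receiver pulls independent messages from a source until it sees an agreed-upon end-of-group marker, here an empty message.

```c
/* A "terminator" message group layered on top of the transport: rows
 * are independent messages, processed as they arrive, and an empty
 * message marks the end of the group. */
#include <stddef.h>

typedef const char *(*msg_source_t) (void *state);

/* consume one message group; returns the number of data messages seen */
int recv_group (msg_source_t next_msg, void *state)
{
    int rows = 0;
    for (;;) {
        const char *msg = next_msg (state);
        if (msg == NULL || *msg == '\0')
            break;               /* empty message = end of group */
        rows++;                  /* process the row right here; no need
                                    to wait for the rest of the set */
    }
    return rows;
}

/* stub source for illustration: three rows followed by the terminator */
const char *stub_source (void *state)
{
    static const char *msgs[] = {"row1", "row2", "row3", ""};
    int *i = (int*) state;
    return msgs[(*i)++];
}
```

Each row is operated on as soon as it arrives, which is the behaviour jugg wants, while 0MQ itself still only ever sees ordinary single-part messages.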
[15:08] <jugg> Well, what I want is a single REQ message to be able to receive multiple REP messages - whatever that looks like.
[15:08] <jugg> the way multipart messaging works at the API level works very well for this. It is just the implementation that does not.
[15:09] <sustrik> ok, i see
[15:10] <sustrik> the scenario makes sense
[15:10] <sustrik> the implications are non-trivial
[15:10] <sustrik> if server X1 sends a first row, then halts for an hour
[15:10] <sustrik> the client would read the first row
[15:10] <sustrik> then halt waiting for an hour
[15:11] <sustrik> although there may be other result sets available from server X2
[15:11] <sustrik> this cannot happen with a simple REQ socket
[15:11] <sustrik> but it can happen with XREQ
[15:11] <sustrik> it's complex stuff, lots of space to experiment
[15:12] <jugg> I'm not sure what you meant by other result sets... a subsequent REQ can't be made until the REP is satisfied... this is no different than the current REQ/REP behavior... if I send a REQ, it waits an indefinite time for a REP.
[15:12] <sustrik> the problem is that there may be a queue in the middle
[15:13] <sustrik> the queue has to be able to process multiple requests at the same time
[15:13] <sustrik> otherwise it would work in a lock-step fashion
[15:13] <sustrik> and the scalability would go south
[15:14] <sustrik> so the queue (composed of an XREQ and an XREP socket) would have to work on a message-part scale
[15:14] <sustrik> which is doable, but not the state of affairs right now
[15:14] <sustrik> if you are interested in the topic feel free to propose a solution
[15:15] <jugg> I guess, from my non-understanding of the internals, looking at it from the point of view that multi-part messaging exists and works, the only change (and maybe this is the sticking point) is to make the multi-parts non-stacking on either end.
[15:17] <sustrik> the main problem is in the middle
[15:17] <sustrik> how would you ensure the queue isn't stuck when there's a half-sent multi-part message being processed
[15:17] <sustrik> and the sender dies without terminating it?
[15:18] <sustrik> also, you need fairness guarantees, so the queue cannot process a very long recordset in a single go
[15:18] <jugg> You've mentioned these "middle queues" before; I'm still at a loss on them and what they are... :/
[15:18] <sustrik> instead it has to assign it a timeslice, process part of it, then move to another client, etc.
[15:19] <sustrik> "queue device"
[15:19] <sustrik> it's a component both requesters and repliers can connect to
[15:19] <sustrik> it then load-balances the requests and routes the replies back
[15:20] <sustrik> see queue.cpp
[15:36] <travlr> sustrik: stuff like your previous conversation needs to go in the faq or be clarified elsewhere etc.
[15:36] <travlr> i'll be scrubbing the irc and mail list eventually
[15:36] <sustrik> you mean the message part stuff?
[15:37] <travlr> well, anything with nuance
[15:37] <sustrik> the question is what should go to the FAQ, what should go to the docs and what should go elsewhere
[15:38] <sustrik> stacking the technical info into the FAQ is probably not the best solution possible
[15:38] <travlr> i'm just saying that i'll keep these issues in mind as i go and will help any way i can with stuff like docs
[15:38] <sustrik> yes, i'm just thinking aloud
[15:39] <sustrik> maybe there's some kind of "ideology" document missing
[15:39] <travlr> that's true for all of foss
[15:40] <sustrik> i.e. not the strict technical reference
[15:40] <sustrik> but some talk about what individual features are intended for
[15:40] <jugg> sustrik: what happens with the current multipart if the sender dies while the I/O thread is sending out the messages? Same problem, no?
[15:40] <sustrik> and why they are designed the way they are etc.
[15:40] <travlr> sustrik: we'll have a conversation about this soon
[15:40] <travlr> and i'll go to town
[15:41] <sustrik> jugg: no, the incomplete messages are rolled back in that case
[15:41] <sustrik> travlr: ok
[15:48] <jugg> sustrik: Could the internals track whether the final message in a non-atomic multipart message has been received, and if it has not, have the recv function return an error if the internals detect a disconnect of the sender? Failing that possibility, I'd say if the only thing hindering this capability is intermediate queues, then document the risk and recommend not using them for this particular usage.
[15:50] <jugg> I haven't really understood why these "devices" are part of the core implementation anyway; they do not require access (afaict) to the internals of 0MQ, and could be implemented as stand-alone applications/libraries, or even just example code.
[15:53] <sustrik> jugg: yes, right now the implementation of devices is pretty trivial
[15:53] <sustrik> in the future it's going to be more tightly integrated with the core
[15:54] <sustrik> as for the usage of devices, have a look here: http://www.zeromq.org/blog:multithreaded-server
[15:55] <sustrik> that's a pretty straightforward usage of the queue device
[15:56] <sustrik> with that in mind, try to put your non-atomic multi-part messages idea down in an email and send it to the mailing list
[15:56] <sustrik> it would be good to see some discussion on the topic
[15:59] <sustrik> btw, i would suggest naming these "stream messages"; it's shorter than "non-atomic multi-part messages"
[16:07] <jugg> sustrik: will do
[16:08] <jugg> thanks for working through all of that. I need to go back and dig into that possible fair queue issue now.
[16:09] <sustrik> you are welcome
[16:35] <CIA-17> zeromq2: Pieter Hintjens master * r1dda8a2 src/msg_store.cpp : Used more expressive variable names - http://bit.ly/aynV9A