[Time] Name | Message |
[00:00] kenkeiter
|
sleeperbot: it was worth a try.. did you verify that you're only running one or two messaging threads?
|
[00:00] andrewvc
|
I've been curious as to how stable the node driver is
|
[00:00] kenkeiter
|
:/
|
[00:03] sleeperbot
|
I'm looking up how to do that
|
[00:03] sleeperbot
|
do you know what I can type in the command line to bring that info up?
|
[00:04] kenkeiter
|
sleeperbot: which platform?
|
[00:04] sleeperbot
|
unix
|
[00:04] sleeperbot
|
ubuntu karmic
|
[00:07] kenkeiter
|
htop might work.. haven't done it under *ni
|
[00:09] kenkeiter
|
http://manpages.ubuntu.com/manpages/lucid/man1/htop.1.html
|
[00:14] sleeperbot
|
I see 3 versions of my node.js stream and web servers
|
[00:14] sleeperbot
|
don't see anything related to zmq
|
[00:15] sleeperbot
|
killed the extraneous processes, will check if anything changed in cpu usage
|
[03:34] andrewvc
|
I assume that XREQ/XREP sockets apply backpressure in the same manner as PUSH/PULL and REQ/REP yes?
|
[11:28] CIA-20
|
zeromq2: 03Martin Lucina 07master * rbe159b6 10/ src/pipe.cpp : zmq::writer_t: Add missing test for swap - http://bit.ly/aGN4bM
|
[11:29] icy
|
sustrik: hi, is there any paper on the algorithm used for the lock-free queue?
|
[11:30] sustrik
|
icy: there's a very old article here:
|
[11:30] sustrik
|
http://www.zeromq.org/whitepapers:y-suite
|
[11:30] sustrik
|
lot of it doesn't apply any more
|
[11:31] sustrik
|
this is what still applies: "HISTORICAL WHITEPAPER
|
[11:31] sustrik
|
Introduction
|
[11:32] sustrik
|
Y-suite is a set of components designed for ultra-efficient passing of messages between threads within a process. Y-suite is somehow similar to local sockets, however, it is much faster.
|
[11:32] sustrik
|
In version 0.1 of ØMQ lightweight messaging kernel, the only y-suite component available is ypipe, a lock-free and wait-free implementation of a queue. In version 0.2 ypollset is added to allow thread to interchange messages with several other threads at the same time (similar to POSIX poll function). Component known as semaphore in version 0.1 is renamed to ysemaphore in version 0.2 to mark that it belongs to y-suite. Same way, spipe is renamed to ysocketpair.
|
[11:32] sustrik
|
Design
|
[11:32] sustrik
|
The basic means of transferring message between threads is ypipe. Messages are passed through a pipe in the standard write and read manner. Once the reader has no more messages to read from the pipe, it notifies the sender using passive synchronization and goes asleep. Passive synchronization means that the other thread is not notified directly using some kind of async signal, rather it will be notified once it tries to write the next message to the pipe. When this happens, writer becomes aware that reader is already asleep or at least going asleep at the moment. It knows that there is new message available, so it wakes the reader up using active synchronization, i.e. actively sending wake-up event to the other thread. Active synchronisation is not provided by ypipe itself, rather by other y-suite components, to be discussed below. Usage of ypipe is depicted on the following sequence diagram:"
|
[11:32] sustrik
|
yuck
|
[11:32] sustrik
|
sorry
|
[11:33] sustrik
|
too much text, but the last paragraph is relevant
|
[11:33] sustrik
|
also see the diagram that follows the text above
|
[11:34] ekidd
|
Good morning! ZeroMQ is a really nice library.
|
[11:35] ekidd
|
If I'm using REQ/REP messaging with multiple servers, what happens if one server is asked to handle an unusually long-running request?
|
[11:36] ekidd
|
Do the clients just route requests to one of the available servers? Or do they continue to send requests to the busy server?
|
[11:36] sustrik
|
ekidd: if you set a high watermark, its queue eventually gets full and subsequent requests will be dispatched to other servers
|
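A minimal sketch of the dispatch rule sustrik describes, written as a toy model in plain Python and not as ØMQ code. `ToyDispatcher`, its `hwm` parameter and the round-robin order are assumptions made for illustration; the only point is that a server whose queue has reached its high water mark is skipped.

```python
from collections import deque


class ToyDispatcher:
    """Toy model: one outgoing queue per server, each capped at hwm."""

    def __init__(self, n_servers, hwm):
        self.queues = [deque() for _ in range(n_servers)]
        self.hwm = hwm
        self._next = 0  # round-robin cursor (assumed order)

    def send(self, request):
        """Queue request on the next server with room; return its index."""
        n = len(self.queues)
        for offset in range(n):
            i = (self._next + offset) % n
            if len(self.queues[i]) < self.hwm:
                self.queues[i].append(request)
                self._next = (i + 1) % n
                return i
        return None  # every queue is at its high water mark


d = ToyDispatcher(n_servers=2, hwm=1)
first = d.send("job-a")    # lands on server 0
second = d.send("job-b")   # round-robin moves on to server 1
third = d.send("job-c")    # both queues are full, nothing accepts the request
```

With a very small high water mark, as in ekidd's case, a busy worker stops attracting new requests as soon as its queue is full. What a real socket does when every queue is full is outside this sketch.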
[11:36] icy
|
sustrik: yea I've read that, I guess because it is single-reader single-writer, it does not suffer from the ABA problem?
|
[11:37] ekidd
|
sustrik: Ah, OK. The useful high watermark in my case is very small: The servers are inherently single-threaded workers with long-running jobs. I want to keep them loaded.
|
[11:37] sustrik
|
icy: what's ABA?
|
[11:38] ekidd
|
I do, however, have lots of clients and servers.
|
[11:38] icy
|
sustrik: http://en.wikipedia.org/wiki/ABA_problem
|
[11:39] icy
|
sustrik: it's one of the main problems that lock-free queues have to overcome
|
[11:39] guido_g
|
it takes an unusually long time to complete the request
|
[11:40] ekidd
|
icy: My clients and servers are different machines, so I don't think the lock-free stuff is relevant. But I might be confused.
|
[11:40] sustrik
|
ekidd: that's a different conversation going on :)
|
[11:40] ekidd
|
Ah, OK. I was confused. :-)
|
[11:41] guido_g
|
ekidd: did you see that req/rep is locked to the send/recv order?
|
[11:41] sustrik
|
ekidd: there's no such thing in 0MQ as explicit ack
|
[11:41] sustrik
|
so there's no way for it to work in lock-step fashion
|
[11:41] ekidd
|
guido_g: Yeah, that works for me.
|
[11:42] sustrik
|
icy: it's basically a two step process
|
[11:42] guido_g
|
same for the rep side (the server)
|
[11:42] ekidd
|
I basically have a farm of Windows workers that take 0.1 to (say) 60 seconds to process a job, and idle time costs money. There's one worker per server.
|
[11:42] guido_g
|
see the user guide for lots of examples and ideas
|
[11:43] sustrik
|
icy: while there are messages to be read the synchronisation is done simply by moving a pointer in the linked list in atomic manner
|
[11:43] sustrik
|
icy: when there are no messages to be read, the pointer becomes NULL
|
[11:43] ekidd
|
I don't mind locking the order of responses the way req/rep does: I'm talking to expensive, single-threaded Windows libraries in any case.
|
[11:44] icy
|
sustrik: understood so far and it seems it does not suffer from aba, just was curious if there was real proof of the correctness of the algorithm
|
[11:44] sustrik
|
icy: reader goes asleep and standard inter-thread mechanism (socketpair) is used to wake it up
|
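The two-step process sustrik outlines (reader flags itself asleep when the pipe is empty, writer wakes it on the next write) can be sketched as a toy model. This is not the ypipe implementation and is not lock-free; `ToyPipe` and the `wakeups` counter are hypothetical names, and the counter merely stands in for the socketpair signal.

```python
from collections import deque


class ToyPipe:
    """Single-writer, single-reader pipe; models the protocol, not lock-freedom."""

    def __init__(self):
        self._items = deque()
        self._reader_asleep = False
        self.wakeups = 0  # how many active wake-up signals were sent

    def read(self):
        """Return the next message, or None after flagging the reader asleep."""
        if self._items:
            return self._items.popleft()
        self._reader_asleep = True  # passive sync: writer sees this on next write
        return None

    def write(self, msg):
        """Queue msg; return True if the sleeping reader had to be woken."""
        self._items.append(msg)
        if self._reader_asleep:
            self._reader_asleep = False
            self.wakeups += 1  # stands in for the socketpair signal
            return True
        return False


pipe = ToyPipe()
empty = pipe.read()        # nothing queued, so the reader goes to sleep
woke = pipe.write("m1")    # writer notices and wakes the reader
quiet = pipe.write("m2")   # reader is awake, no signal needed
```

While messages keep flowing, no wake-up signal is sent at all, which is where the scheme saves work.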
[11:44] ekidd
|
But I want to maximize utilization of those expensive libraries.
|
[11:44] sustrik
|
icy: no
|
[11:44] sustrik
|
want to prove it?
|
[11:44] icy
|
that would take more time than I have probably :)
|
[11:45] ekidd
|
As long as zeromq clients respect the individual server's high water marks and route requests to another worker, everything will work fine.
|
[11:46] ekidd
|
I'm going to write some tests (of course). I just wanted to know whether I was even trying something sane. :-)
|
[11:48] ekidd
|
Many thanks for your advice, folks!
|
[11:51] guido_g
|
ekidd: did you read http://api.zeromq.org/zmq_socket.html, there is something on hwm
|
[11:54] ekidd
|
guido_g: Excellent. It definitely has the right semantics. I'll still need to find out whether it does the right thing, performance-wise, with large messages and queues that are often at their high water marks.
|
[13:44] cremes
|
while writing some specs for my bindings this weekend i came across a few issues with SWAP, RECOVERY_IVL and RATE
|
[13:45] cremes
|
all 3 of those take signed 64-bit integers for input
|
[13:45] cremes
|
they also do *not* return an error when passed a negative number even though that doesn't make any sense
|
[13:45] cremes
|
should the library return an error for negative numbers or should my bindings take care of that issue?
|
[13:57] ptrb
|
so I have this: http://pastebin.com/dXEveLCx
|
[13:58] ptrb
|
I start the server, it sits at zmq_recv(), great; i run the client, it runs fine and exits, but the server never receives anything. ideas?
|
[14:00] pieterh
|
cremes: i think all the setsockopt types need to be reviewed for 3.0
|
[14:00] cremes
|
pieterh: ok; so should i open bugs for those against the 2.1.x branch?
|
[14:00] pieterh
|
but certainly if they are signed and get a negative value, that should return EINVAL
|
[14:00] pieterh
|
yup, even 2.0.x IMO
|
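Until the library rejects such values itself, a binding could guard against them on its own side. A minimal sketch, assuming a hypothetical helper `check_option` that a binding would call before handing the value to setsockopt; the option names are the three cremes lists, and nothing here is the actual binding or library code.

```python
import errno

# Options cremes names as accepting signed 64-bit values.
NON_NEGATIVE_OPTIONS = {"SWAP", "RECOVERY_IVL", "RATE"}


def check_option(name, value):
    """Hypothetical binding-side guard: reject negative values with EINVAL."""
    if name in NON_NEGATIVE_OPTIONS and value < 0:
        raise OSError(errno.EINVAL, f"{name} must not be negative, got {value}")
    return value


ok = check_option("RATE", 100)
try:
    check_option("SWAP", -1)
    rejected = None
except OSError as exc:
    rejected = exc.errno
```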
[14:00] cremes
|
ok, i'll do that now
|
[14:01] pieterh
|
ptrb: looking at it...
|
[14:02] ptrb
|
pieterh: thx; I'm guessing there's some setup step I've overlooked
|
[14:02] pieterh
|
ptrb: try 'ps'
|
[14:02] pieterh
|
imo you have a second copy of the server running
|
[14:02] pieterh
|
(though it would assert then...)
|
[14:03] pieterh
|
sorry, forget I said that plz
|
[14:03] ptrb
|
hmm, no, but maybe something else is sitting on 5001, let me try changing that
|
[14:03] pieterh
|
ptrb: client writes a message and then closes & exits
|
[14:03] pieterh
|
two things: (a) it should wait for a reply
|
[14:03] pieterh
|
(b) if it does not want to wait, it can't exit immediately
|
[14:03] pieterh
|
you need to read the users guide
|
[14:04] ptrb
|
I have.
|
[14:04] pieterh
|
0mq/2.0.x loses data if you close the socket while there is data in flight
|
[14:04] cremes
|
ptrb: are you starting the server first?
|
[14:04] ptrb
|
Of course.
|
[14:04] pieterh
|
send/close is not going to work
|
[14:04] pieterh
|
send/recv/close is ok
|
[14:04] pieterh
|
send/sleep/close is ok
|
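Why send/close loses the message while send/recv/close does not can be pictured with a toy model. `ToySocket` and `io_step` are hypothetical; `io_step` stands in for the time a recv or a sleep gives the background I/O to push queued data out. In this model the step always flushes everything, which a real sleep does not guarantee.

```python
class ToySocket:
    """Toy model of a socket whose sends are flushed by a background I/O step."""

    def __init__(self):
        self._unsent = []    # queued by send(), not yet on the wire
        self.delivered = []  # what the peer would actually receive

    def send(self, msg):
        self._unsent.append(msg)  # returns before anything is transmitted

    def io_step(self):
        """Stand-in for the time a recv or sleep gives the I/O thread."""
        self.delivered.extend(self._unsent)
        self._unsent.clear()

    def close(self):
        """Discard whatever is still queued; return how many messages were lost."""
        lost = len(self._unsent)
        self._unsent.clear()
        return lost


hasty = ToySocket()
hasty.send("hello")
lost_hasty = hasty.close()      # send/close: the message never left

patient = ToySocket()
patient.send("hello")
patient.io_step()               # send/recv/close or send/sleep/close
lost_patient = patient.close()
```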
[14:04] ptrb
|
OK, so, do I need to recv() in the clie... k
|
[14:04] cremes
|
ah yes, that's right
|
[14:04] cremes
|
do a sleep before exiting
|
[14:04] ptrb
|
even if I don't post anything back explicitly?
|
[14:04] pieterh
|
ptrb: either a recv
|
[14:05] pieterh
|
ptrb: if you're using REQ and REP sockets, you should be doing send/recv and recv/send
|
[14:05] pieterh
|
if you want to just send 1 message as such use PUSH/PULL
|
[14:05] pieterh
|
it's not a biggie
|
[14:05] pieterh
|
the problem here is not giving the client process time to send its data
|
[14:05] ptrb
|
I'm doing something vaguely RPC-ish, so I guess if I want to represent a void blah(); I still have to send something back
|
[14:06] pieterh
|
or else use XREQ/XREP
|
[14:06] ptrb
|
yeah it makes sense, sure. thanks. i guess it's just not explicit anywhere in the docs (afaict)
|
[14:06] pieterh
|
rtfug... :-)
|
[14:06] pieterh
|
it is explicit in there
|
[14:06] ptrb
|
i have; if you want to point me to the sentence in question I'm happy to be made a fool
|
[14:07] pieterh
|
Note that we do sleep (1); before exiting the ventilator. This is a hack that gets around ØMQ/2.0's design, which discards messages that have not yet been sent, if you exit the program too soon. If you are using ØMQ/2.1 you can remove this sleep statement.
|
[14:07] ptrb
|
eh.
|
[14:08] pieterh
|
http://www.zeromq.org/docs:user-guide-1#toc7
|
[14:08] pieterh
|
it's the first example that has this problem, so I explain it there
|
[14:08] pieterh
|
the hello world client waits for an answer
|
[14:08] pieterh
|
and the pubsub example never exits
|
[14:09] pieterh
|
maybe i should put it in bold...
|
[14:09] pieterh
|
and repeat this, it's a common fault
|
[14:11] ptrb
|
if you're willing to take some constructive criticism about the documentation, i'd say that while example-based docs are great, when I have a specific problem (like this) I find there isn't really a way I can find a solution; there's no idioms or implementation details or whatever to search through (as far as I've found)
|
[14:13] ptrb
|
but!! but but, thank you :)
|
[14:19] ptrb
|
hmm, recv on the client side hangs... is there not some zmq_flush or something I can call?
|
[14:20] guido_g
|
no
|
[14:20] ptrb
|
poop :|
|
[14:21] guido_g
|
pardon?
|
[14:21] ptrb
|
that was an expression of mild disappointment
|
[15:35] pieterh
|
ptrb: still there?
|
[15:36] pieterh
|
sorry, was in a meeting
|
[15:36] ptrb
|
yeah sure
|
[15:36] pieterh
|
making a problem driven section in the guide would be good
|
[15:37] pieterh
|
did you find out why your client hangs?
|
[15:37] ptrb
|
No, I just threw a sleep in there and moved on to bigger, even more problematic things :)
|
[15:37] ptrb
|
a problem-driven section would be good, but it'll never be comprehensive
|
[15:38] ptrb
|
FWIW I think a good documentation model would be ZeroC's ICE, which has a really comprehensive .pdf
|
[15:38] pieterh
|
"did not get a message" is a pretty classic stumbling block
|
[15:38] ptrb
|
yeah, fair
|
[15:38] pieterh
|
i'll write a flowchart
|
[15:40] ptrb
|
now, i'm working on an implementation based on the multithreaded code in the user guide, and i'm getting infinite size-0 messages on the server side after sending one legitimate message from a client
|
[15:40] ptrb
|
ever hear of something like this?
|
[15:42] ptrb
|
sorry, based on the multithreaded server in the *introduction* doc
|
[15:43] cremes
|
ptrb: i've never seen that... you say your code is "based on" the example; it's always a good idea to start from code that you *know* works and modify from there
|
[15:43] cremes
|
sounds like your mods broke it
|
[15:44] cremes
|
the easiest way to find the failure is to revert back to the original "good" code and slowly modify it to your specifications
|
[15:44] ptrb
|
yeah. i know. i'm trying to drop the server into an existing process to provide a zmq "layer", so there's not really any way to iterate my way to where I am now.
|
[15:45] ptrb
|
i guess i can try taking out some functionality.
|
[15:45] cremes
|
did you change the code that sends 0mq messages?
|
[15:48] ptrb
|
yes; in ways i initially thought were inconsequential, but i suppose i'm in an assumption-revalidating mood :)
|
[15:49] ptrb
|
as a meta-comment, it's really great you guys are hanging out on irc to help folks; zmq is a great project and this is a great resource.
|
[15:50] ptrb
|
aha! so, if i zmq_recv(), get a message, and don't zmq_send() something in the server, subsequent zmq_recv()s have the effect of not blocking
|
[15:51] ptrb
|
...which seems quite strange to me
|
[15:51] cremes
|
ptrb: what kind of socket are you using on this server side?
|
[15:53] cremes
|
because that behavior doesn't sound right; the zmq_recv() call is returning an error, right?
|
[15:54] ptrb
|
yeah, it returns -1 EAGAIN
|
[15:54] ptrb
|
i believe it's EAGAIN, at least.
|
[15:56] ptrb
|
the topology is the multithread server example in the intro doc: public tcp XREP endpoint, managed by one thread running zmq_device(ZMQ_QUEUE, ...), forwarding via XREQ to an inproc endpoint, being consumed by worker threads binding to REP
|
[16:02] cremes
|
can you provide a code pastie?
|
[16:03] ptrb
|
it won't be complete, but sure, one sec...
|
[16:03] cremes
|
it doesn't need to be complete... i want to see the code that sets up the socket and calls recv on it
|
[16:04] ptrb
|
the worker thread ultimately responsible for processing the recv, right?
|
[16:04] cremes
|
whatever code is returning -1 EAGAIN
|
[16:05] ptrb
|
http://pastebin.com/rdHn4iX8
|
[16:08] cremes
|
ptrb: in your DEBUG statement, also print out the value of zmq_strerror()
|
[16:08] cremes
|
i need more information to figure this out
|
[16:09] ptrb
|
Operation cannot be accomplished in current state
|
[16:10] cremes
|
ah, then there we have it; with a REP socket you can't call recv again until you have subsequently called send
|
[16:10] cremes
|
it needs that recv/send/recv pattern because it maintains a small internal state machine
|
[16:10] ptrb
|
oh, interesting
|
[16:10] cremes
|
that's the whole point of the REQ/REP socket pattern
|
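The small internal state machine cremes mentions can be sketched as a toy class. `ToyRep` and `StateError` are hypothetical names, and the error text is the string ptrb pasted from zmq_strerror(); this models only the strict recv/send alternation, not a real REP socket (which would block on an empty queue instead of raising IndexError).

```python
from collections import deque


class StateError(Exception):
    """Raised when recv/send are called out of order."""


class ToyRep:
    """Toy REP-like socket that enforces strict recv/send alternation."""

    def __init__(self, incoming):
        self._incoming = deque(incoming)
        self._reply_due = False
        self.sent = []

    def recv(self):
        if self._reply_due:
            raise StateError("Operation cannot be accomplished in current state")
        self._reply_due = True
        return self._incoming.popleft()

    def send(self, reply):
        if not self._reply_due:
            raise StateError("Operation cannot be accomplished in current state")
        self.sent.append(reply)
        self._reply_due = False


rep = ToyRep(["req-1", "req-2"])
first = rep.recv()
try:
    rep.recv()              # second recv without a send in between
    error = None
except StateError as exc:
    error = str(exc)
rep.send("ack")             # any reply puts the socket back in the recv state
second = rep.recv()
```

This is why a void RPC call over REQ/REP still needs some reply, even an empty one.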
[16:10] ptrb
|
ok
|
[16:10] ptrb
|
see, this is useful! this should be on a website somewhere :)
|
[16:10] cremes
|
the worker is supposed to respond when it is done, right?
|
[16:10] ptrb
|
well, my thought is that it may optionally respond
|
[16:10] cremes
|
it is for sure
|
[16:10] ptrb
|
but if it *has* to respond, that's fine too
|
[16:11] cremes
|
if you want it to be optional, use XREP sockets
|
[16:11] cremes
|
that kind of socket does not enforce the recv/send/recv pattern
|
[16:11] ptrb
|
I suspect there is more to XREP than simply dropping that enforcement, though
|
[16:13] cremes
|
ptrb: not really; REP sockets are built on top of the XREP socket
|
[16:14] ptrb
|
hmm, interesting
|
[16:14] cremes
|
REP sockets know how to "route" their responses back over multiple hops
|
[16:14] cremes
|
you need to do a little extra work when using an XREP socket to retain that functionality
|
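One way to picture the extra work cremes refers to: keep the routing envelope from the request and put it back in front of the reply. The frame layout used here (identity, empty delimiter frame, payload) is an assumption for illustration, and `split_envelope` and `reply_with_envelope` are hypothetical helpers that operate on plain lists of bytes, not on sockets.

```python
def split_envelope(frames):
    """Split frames into (envelope, body) at the first empty delimiter frame."""
    cut = frames.index(b"") + 1
    return frames[:cut], frames[cut:]


def reply_with_envelope(frames, handler):
    """Keep the request's envelope and swap in the handler's reply body."""
    envelope, body = split_envelope(frames)
    return envelope + handler(body)


# Assumed layout: identity, delimiter, payload.
request = [b"client-1", b"", b"ping"]
reply = reply_with_envelope(request, lambda body: [b"pong"])
```

If the worker chooses not to answer, it simply skips the reply step; nothing in this sketch forces a response.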
[16:15] cremes
|
this might help a little: http://www.zeromq.org/recipe:new-recipe
|
[16:26] ptrb
|
curiouser and curiouser
|
[20:37] ModusPwnens
|
Hi guys, I have a question on req/rep topology
|
[20:38] ModusPwnens
|
Previously, I have been doing benchmarking with the subscriber/publisher topology, but I wanted to see what results I would get with req/rep
|
[20:38] ModusPwnens
|
and I was wondering if there is anything else I need to do besides the obvious change of the socket types and adding additional send/recv function calls to avoid blocking the code
|
[20:39] ModusPwnens
|
because i noticed after I did those things, rather than sending a message of X bytes, it sends X messages of 1 byte
|
[20:41] ModusPwnens
|
Actually, I lied. It seems to just send a lot of zero byte messages
|
[20:41] cremes
|
ModusPwnens: pastie some code, because it should "just work"
|
[20:51] ModusPwnens
|
Actually, i figured out what it was. However, should rep/req have better or worse performance than sub/pub?
|
[21:07] cremes
|
ModusPwnens: same perf but round-trip latency is higher (no such notion as round-trip latency with pub/sub)
|