[Time] Name | Message |
[02:28] driethr
|
hello. do I have to zmq_setsockopt for ZMQ_IDENTITY before the zmq_connect is done? there is no chance to reassign it, before sending a message?
|
[07:43] sustrik
|
driethr: no, no chance
|
[09:20] omarkj
|
Morning.
|
[09:24] keffo
|
Morning
|
[09:24] keffo
|
What is it Sebastian? I'm arranging matches
|
[09:24] keffo
|
(sorry, felt British!) :)
|
[09:58] omarkj
|
Haha, it's fine.
|
[12:43] omarkj
|
Any idea what can cause a message not getting sent to a ZMQ_PUB socket?
|
[12:43] omarkj
|
When I'm publishing a few thousand messages, I sometimes have to retry sending each one up to ~600 times before they're actually published
|
[13:42] drbobbeaty
|
Question about an assertion in the code: If I get the log message: "Assertion failed: *tmpbuf > 0 (zmq_decoder.cpp:60)" -- this is in the 2.0.9 release version.
|
[13:42] drbobbeaty
|
What should I be looking at for the root cause?
|
[13:43] drbobbeaty
|
I'm using a simple "epgm://" PUB/SUB system and this is on the SUB-side.
|
[13:47] drbobbeaty
|
The code seems to indicate that there's nothing in the message to decode, but at the same time, the code has a TODO saying that work is still needed on the handling of oversized messages.
|
[13:48] drbobbeaty
|
So I'm wondering if there are limits I need to adhere to in message sizes when using the epgm:// transport?
|
[13:53] cremes
|
omarkj: what version of 0mq? what platform (linux, osx, windows)? and do you have a small code example that shows the problem?
|
[13:55] omarkj
|
Using version 2.10 on Linux (Ubuntu to be precise). Publishing from an Erlang process.
|
[13:59] cremes
|
omarkj: it's possible there is a bug in 2.1.0 that you have uncovered; it hasn't been released yet
|
[13:59] cremes
|
do you see the same problem with 2.0.10?
|
[13:59] omarkj
|
Ah.
|
[13:59] omarkj
|
I'll have to try it.
|
[13:59] cremes
|
that's the last release of the 2.0 series
|
[14:00] cremes
|
honestly, it's more likely there is a bug in your code; unfortunately, i can't read erlang otherwise i'd lend a hand
|
[14:00] cremes
|
the most likely culprits are...
|
[14:00] cremes
|
1. starting your publisher *before* you start the subscribers; all messages get dropped due to a timing issue
|
[14:00] cremes
|
2. forgetting to call setsockopt(ZMQ_SUBSCRIBE, topic) to set a subscriber topic
|
[14:01] cremes
|
3. not sleeping at the end of your publisher's send; the subscriber doesn't have time to fetch messages before the queue is released
|
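Culprit 2 above is easy to hit because a SUB socket delivers nothing until at least one subscription prefix is set. The following is a minimal sketch of that prefix matching as a toy model in plain Python; the function name and the sample topics are made up for illustration and this is not the libzmq implementation.

```python
def sub_socket_accepts(subscriptions, message):
    """Toy model of SUB-side filtering (hypothetical helper, not libzmq).

    A message is delivered only if it starts with at least one of the
    subscribed prefixes.
    """
    return any(message.startswith(prefix) for prefix in subscriptions)


# No subscription set: every message is filtered out.
print(sub_socket_accepts([], b"quotes.IBM 123.45"))            # False
# An empty prefix matches every message.
print(sub_socket_accepts([b""], b"quotes.IBM 123.45"))         # True
# A specific prefix only lets matching topics through.
print(sub_socket_accepts([b"trades."], b"quotes.IBM 123.45"))  # False
```

In a real program the equivalent step is the setsockopt(ZMQ_SUBSCRIBE, topic) call mentioned in item 2.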
[14:01] Andreas
|
Hi everybody!
|
[14:02] Andreas
|
I got a question regarding pub/sub. Maybe somebody can help me with that ....
|
[14:02] Andreas
|
Is there any possibility to get notified when a subscriber gets disconnected?
|
[14:02] guido_g
|
no
|
[14:03] guido_g
|
you have to do it yourself
|
[14:03] Andreas
|
is something planned for future releases?
|
[14:04] omarkj
|
cremes: I guess I'm not sleeping for the ms after the send..
|
[14:05] cremes
|
omarkj: try that and see what happens... sleep for 10s just for kicks
|
[14:06] omarkj
|
Why is that needed by the way? To flush or something like that?
|
[14:17] cremes
|
omarkj: the publisher would exit before the subscriber could process all of the messages; that's all
|
[14:17] cremes
|
omarkj: read the guide if you haven't yet; it answers pretty much all of the basics: http://zguide.zeromq.org/chapter:all
|
[14:22] sustrik
|
drbobbeaty: are you using multi-part messages?
|
[14:22] omarkj
|
No, I have, just wondering,
|
[14:23] drbobbeaty
|
No, I'm just making a single message and sending it. If it's multi-part, then it's by default as I'm using a very simplistic interface to ZeroMQ.
|
[14:23] sustrik
|
hm, it looks like a bug, can you provide a minimal test case to reproduce it?
|
[14:24] drbobbeaty
|
I'm actually going to try 2.0.10 now as I just noticed that it's been released. If that has the same issue, I'll send something to the mailing list.
|
[14:25] sustrik
|
drbobbeaty: thanks
|
[14:27] sustrik
|
omarkj: with 2.1 the sleep at the end of the program should not be needed
|
[14:27] sustrik
|
zmq_term just blocks until all outbound messages are sent
|
[14:27] omarkj
|
Okay.
|
[14:28] sustrik
|
omarkj: have you ZMQ_HWM option set?
|
[14:29] omarkj
|
sustrik: Yup, it's set at one. Maybe raise that.
|
[14:29] omarkj
|
I don't remember if I tried doing that.
|
[14:29] sustrik
|
that's the problem
|
[14:29] sustrik
|
if you have buffer of size 1
|
[14:29] sustrik
|
it can store 1 message
|
[14:29] pieterh
|
omarkj: yes, you're basically just dropping stuff at the pub side
|
[14:29] omarkj
|
Okay.
|
[14:29] pieterh
|
HWM=1 is for specialized request-reply cases or if you are using SWAP
|
[14:30] omarkj
|
I see, silly me. I'll give it a try.
|
[14:30] pieterh
|
just remove all HWM settings
|
[14:30] omarkj
|
I was having problems with the server crashing badly after some time when I did that.
|
[14:30] pieterh
|
then remove IDENTITY settings on the subscribers
|
[14:31] pieterh
|
read the section on durable subscribers in the guide
|
[14:31] pieterh
|
if you create durable subscribers you must set a HWM but something like 10,000 is reasonable depending on message size and rate
|
[14:31] pieterh
|
actually with durable subscribers the server will eventually crash anyhow since there's no concept of ending a durable subscriber
|
[14:32] sustrik
|
even if there's no identity set
|
[14:32] sustrik
|
if the sender is faster than the subscriber
|
[14:32] pieterh
|
yes, indeed
|
[14:32] sustrik
|
the buffer would eventually grow out of memory
|
[14:32] pieterh
|
what the guide says is "set HWM in a serious publisher"
|
[14:33] pieterh
|
but not 1 :-)
|
[14:33] omarkj
|
Haha, yes, I must have changed it to one during some bug hunting or something.
|
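The effect of HWM=1 described above can be illustrated with a toy model of a bounded send queue. This is only a sketch under stated assumptions: the function name, the `drain_every` parameter and the numbers are hypothetical, and the model ignores everything libzmq actually does apart from "a full queue drops the next message".

```python
def publish_burst(hwm, burst_size, drain_every):
    """Toy model of a bounded send queue (hypothetical, not libzmq).

    The publisher attempts `burst_size` sends. The network side removes one
    queued message after every `drain_every` send attempts. A send attempted
    while `hwm` messages are already queued is counted as dropped.
    Returns (delivered, dropped).
    """
    queued = delivered = dropped = 0
    for attempt in range(1, burst_size + 1):
        if queued < hwm:
            queued += 1
        else:
            dropped += 1
        if attempt % drain_every == 0 and queued:
            queued -= 1
            delivered += 1
    # Anything still queued at the end is assumed to be flushed eventually.
    return delivered + queued, dropped


print(publish_burst(hwm=1, burst_size=1000, drain_every=2))       # (500, 500)
print(publish_burst(hwm=10000, burst_size=1000, drain_every=2))   # (1000, 0)
```

With a buffer of one message, half of this burst is lost as soon as the publisher is only twice as fast as the drain. A larger HWM absorbs the burst, at the cost of memory if the publisher stays faster for long.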
[15:52] drbobbeaty
|
Question: I have a user that's trying to run/debug a ZeroMQ app I've written in NetBeans 6.7.1. In gdb it runs fine, but in NetBeans it doesn't. I've never used NetBeans and can't really help because it works in the shell and in gdb. Has anyone ever heard of any issues with ZeroMQ and NetBeans? Assuming this guy can run it in gdb...
|
[15:54] DerGuteMoritz
|
AFAIK IDEs like NetBeans tend to mess with PATH
|
[15:54] DerGuteMoritz
|
maybe he needs to change some setting first
|
[15:54] DerGuteMoritz
|
I bet this is on Windows?
|
[16:58] drbobbeaty
|
No, NetBeans on Linux
|
[16:59] drbobbeaty
|
Seems to have a decent path as it's starting to run, and the LD_LIBRARY_PATH is set, but thanks for the idea. We'll have to keep checking things.
|
[19:19] sustrik
|
drbobbeaty: hi
|
[19:19] sustrik
|
still there?
|
[19:19] drbobbeaty
|
Yup
|
[19:19] sustrik
|
i am not sure i understand your use case exactly
|
[19:19] drbobbeaty
|
OK... I'll explain.
|
[19:19] sustrik
|
how many publisher sockets are there
|
[19:19] sustrik
|
?
|
[19:20] drbobbeaty
|
At the current time - the network will have probably 270. Not all on one process - maybe 4 to 10 per process.
|
[19:21] sustrik
|
i mean in the test where you are seeing the problem
|
[19:21] drbobbeaty
|
The idea is to distribute the exchange tick data on different multicast addresses to allow the switches to "squelch" the traffic if it's not subscribed for.
|
[19:21] drbobbeaty
|
In the test, there are four open publishers.
|
[19:21] drbobbeaty
|
Two of which are lines 80 and 81 in the gist.
|
[19:21] sustrik
|
four PUB sockets?
|
[19:21] drbobbeaty
|
Yup.
|
[19:21] drbobbeaty
|
in one process.
|
[19:22] drbobbeaty
|
In the other process, there is one SUB socket with 27 "connections"
|
[19:22] sustrik
|
how many connects/binds on each PUB socket?
|
[19:23] drbobbeaty
|
One socket - One connect. It's using epgm:// so I didn't think I needed a bind() - at least I didn't see it in the examples.
|
[19:23] sustrik
|
sure
|
[19:23] sustrik
|
so each PUB socket connects to a different multicast group, right?
|
[19:24] drbobbeaty
|
Yup - exactly right.
|
[19:24] sustrik
|
so you have 4 PUB sockets and 4 multicast groups
|
[19:24] drbobbeaty
|
Yes... in the transmitter process.
|
[19:24] sustrik
|
good
|
[19:25] sustrik
|
now, in the gist i see you connect SUB socket to ~20 multicast groups
|
[19:25] sustrik
|
meaning 16 of them are idle
|
[19:25] sustrik
|
right?
|
[19:25] drbobbeaty
|
Actually, of the 4 PUB sockets in the transmitter - only two are listed in the gist - so in the gist example, 2 of the 27 are expecting traffic.
|
[19:26] sustrik
|
ok
|
[19:26] sustrik
|
there's only one SUB socket, right?
|
[19:26] drbobbeaty
|
Right.
|
[19:26] sustrik
|
now, rach PUB socket transmits 4000 msgs/sec, right?
|
[19:26] sustrik
|
each*
|
[19:26] drbobbeaty
|
approximately, yes.
|
[19:27] sustrik
|
so we have ~16000 msgs/sec on the wire
|
[19:27] sustrik
|
of which the sub socket should retrieve 8000/sec
|
[19:27] sustrik
|
now, what are you seeing?
|
[19:28] sustrik
|
800,000 msgs/sec?
|
[19:28] drbobbeaty
|
To be honest, the PUB sockets aren't all the same at 4000 msgs/sec - two of them (the ones I'm NOT listening to in the gist) are much less... So I'd say we're looking at 8000 msgs/sec I should see.
|
[19:29] sustrik
|
ok
|
[19:29] drbobbeaty
|
With the gist example I'm seeing 200k - 700k msgs/sec.
|
[19:29] drbobbeaty
|
in my latest tests.
|
[19:29] sustrik
|
that's number of successful recvs on the SUB socket, right?
|
[19:29] sustrik
|
per second
|
[19:29] drbobbeaty
|
Yup.
|
[19:30] drbobbeaty
|
It varies a lot as it's live exchange data.
|
[19:30] sustrik
|
so let's say each "connect" gets all the data sent
|
[19:31] drbobbeaty
|
But when I edit the gist to only connect to the URLs in line 80 and 81, the numbers line up very close.
|
[19:31] sustrik
|
that's 16000 msgs/sec * 27
|
[19:31] sustrik
|
432000 msgs/sec
|
[19:31] sustrik
|
hm
|
[19:31] drbobbeaty
|
OK, I'm with you... Yeah... and I didn't see it being a linear multiplication either.
|
[19:32] sustrik
|
what's the interval for calculation of throughput?
|
[19:32] sustrik
|
one second?
|
[19:32] drbobbeaty
|
Typically 10 sec on the transmitter and about 1 sec on the receiver.
|
[19:32] sustrik
|
ok, so the variation may be caused by small sample interval on the receiver
|
[19:33] drbobbeaty
|
Yeah, easily could be.
|
[19:33] sustrik
|
can you use a larger interval?
|
[19:33] drbobbeaty
|
Yeah.
|
[19:33] sustrik
|
great
|
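Why a 1-second sample interval can exaggerate the variation is easy to see with made-up numbers. The sketch below averages per-second message counts over windows of different lengths; the counts are invented for illustration and are not measurements from this conversation.

```python
def windowed_rates(per_second_counts, window):
    """Average msgs/sec over consecutive, non-overlapping windows."""
    return [
        sum(per_second_counts[i:i + window]) / window
        for i in range(0, len(per_second_counts) - window + 1, window)
    ]


# Invented bursty traffic: 20 seconds of per-second receive counts.
counts = [0, 16000, 0, 0, 24000, 0, 8000, 0, 32000, 0] * 2

print(windowed_rates(counts, 1)[:5])  # [0.0, 16000.0, 0.0, 0.0, 24000.0]
print(windowed_rates(counts, 10))     # [8000.0, 8000.0]
```

The 1-second readings swing between 0 and 24000 in the first five samples, while the 10-second average of the same data is a steady 8000 msgs/sec.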
[19:34] drbobbeaty
|
Increased it to 10 sec on the receiver... still seeing 92k msgs/sec received versus 4k msgs/sec sent.
|
[19:35] sustrik
|
92000, ok
|
[19:35] sustrik
|
92000 / 27 = 3400
|
[19:36] drbobbeaty
|
It's close if you do the division...
|
[19:36] sustrik
|
that's kind of close to the sending rate on a single pub socket
|
[19:36] drbobbeaty
|
Yup. Agreed.
|
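The arithmetic in this exchange can be checked with a short sketch. The function names are hypothetical; the figures (16000 msgs/sec on the wire, 27 connects, 92000 msgs/sec observed) are the ones quoted above.

```python
def expected_if_every_connect_duplicates(wire_rate, connects):
    """Receive rate if each connect delivered all traffic on the wire."""
    return wire_rate * connects


def per_connect_rate(observed_rate, connects):
    """Observed receive rate divided evenly across the connects."""
    return observed_rate / connects


print(expected_if_every_connect_duplicates(16000, 27))  # 432000
print(round(per_connect_rate(92000, 27)))               # 3407
```

The second figure is what sustrik rounds to 3400, which is near the roughly 4000 msgs/sec sent on a single busy PUB socket.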
[19:36] drbobbeaty
|
Which is why I wondered if I was doing something wrong and publishing the same message on ALL URLs and getting duplicates that way.
|
[19:37] drbobbeaty
|
But I didn't see how in the code, and in the logging of the connections.
|
[19:37] sustrik
|
if you turn the unused publishers off, does it make a difference in throughput on the SUB side?
|
[19:37] drbobbeaty
|
If they are unused, they are never turned "on" - the default is to be OFF until needed. So it doesn't affect this.
|
[19:38] drbobbeaty
|
...on the PUB side.
|
[19:38] drbobbeaty
|
On the SUB side, if I turn off the unused ones, the numbers line up very closely.
|
[19:38] sustrik
|
how do you do that?
|
[19:38] sustrik
|
there's some OOB channel to inform publishers to start/stop?
|
[19:39] drbobbeaty
|
On the PUB side, I look at the message from the exchange - is it a quote, is it a trade, what symbol is it for - I use these to "classify" the message into 1 of the 270 multicast channels.
|
[19:39] drbobbeaty
|
If the PUB channel isn't open, a socket is created, the connection to the correct URL is made, and the message is sent.
|
[19:39] drbobbeaty
|
"On Demand" sockets and connections on the PUB side.
|
[19:40] drbobbeaty
|
The SUB side can't know what's "active" so it had to listen to large "sections" of the multicast space.
|
[19:40] sustrik
|
what i am trying to figure out is whether there are 2 sockets publishing or 4
|
[19:40] sustrik
|
in the test scenario
|
[19:41] drbobbeaty
|
The transmitter is publishing on 4. Two of which are in the gist code, and two are not. The two that are in the gist code are the "busy" ones.
|
[19:42] drbobbeaty
|
Essentially, the gist test receiver is listening to a part of what the transmitter is sending. But it's also listening to a lot of "dead channels".
|
[19:42] sustrik
|
ok, but there are 4 sockets pushing data to the wire
|
[19:42] drbobbeaty
|
Yup.
|
[19:42] sustrik
|
thus overall load on the wire is ~16000 msgs/sec
|
[19:42] sustrik
|
good
|
[19:43] sustrik
|
if you stop the two publishers that nobody listens to
|
[19:43] sustrik
|
does it change the throughput on the receiver?
|
[19:43] drbobbeaty
|
Hmmm... I don't know. I can try that. I'll do that now.
|
[19:49] drbobbeaty
|
Lots of fluctuation in the market now, but it appears that the numbers sent and received are nearly the same as before. This, I believe, is due to the different loads on the 4 multicast channels. The two I'm listening to are Quotes - very high volume. The two I'm not listening to, and have now turned off, are Trades, and very low volume. If I recall correctly, the ration is somewhere in the 200:1 range or so. Meaning there are about 200 Quote messages for
|
[19:49] drbobbeaty
|
every Trade message - roughly.
|
[19:50] drbobbeaty
|
That was supposed to read: "..ratio is somewhere in the 200:1 range..."
|
[19:51] sustrik
|
ok, anyway
|
[19:52] sustrik
|
it looks like the SUB socket is getting the data from each connect even though 25 of them should get nothing
|
[19:52] drbobbeaty
|
That's my theory.
|
[19:52] sustrik
|
it looks like the filtering is broken
|
[19:52] sustrik
|
would it be possible to write a simple test program
|
[19:53] sustrik
|
say a publisher that sends 1 message/sec
|
[19:53] sustrik
|
and a subscriber that would connect to two different mcast groups
|
[19:53] sustrik
|
and check whether we get 1 or 2 msgs/sec on the SUB side?
|
[19:54] sustrik
|
if we'll get 2, the theory is proven
|
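The logic of the proposed experiment can be written down as a toy model. This is a sketch under assumptions: the function, its parameters and the group names "g1" and "g2" are hypothetical, and "filtering is broken" is modelled simply as every connect delivering all traffic on the wire, which is the theory being tested here, not a confirmed description of the bug.

```python
def received_per_second(active_groups, connected_groups, rate_per_group,
                        filtering_works):
    """Toy model of the SUB-side receive rate (not libzmq behaviour).

    active_groups:    multicast groups a publisher is sending on
    connected_groups: multicast groups the SUB socket connected to
    rate_per_group:   msgs/sec sent on each active group
    """
    if filtering_works:
        # Only groups that are both active and connected contribute.
        matching = set(active_groups) & set(connected_groups)
        return rate_per_group * len(matching)
    # Hypothesis under test: each connect sees all traffic on the wire.
    wire_rate = rate_per_group * len(active_groups)
    return wire_rate * len(connected_groups)


# One publisher at 1 msg/sec, subscriber connected to two groups.
print(received_per_second(["g1"], ["g1", "g2"], 1, filtering_works=True))   # 1
print(received_per_second(["g1"], ["g1", "g2"], 1, filtering_works=False))  # 2
```

Seeing 2 msgs/sec in the real test would be consistent with the second branch.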
[19:56] drbobbeaty
|
I can write that, sure. I'll use the same gist receiver but I'll make a simple transmitter and have it transmit on 1 or 2 of the multicast channels at a regular interval, and we'll see.
|
[19:56] drbobbeaty
|
It'll probably take me until tomorrow morning, but I'll do it and post it to the mailing list with the results.
|
[19:57] drbobbeaty
|
Sounds OK?
|
[19:57] sustrik
|
sure
|
[19:57] sustrik
|
just keep the code as simple as possible
|
[19:57] sustrik
|
while (1) {
|
[19:57] sustrik
|
sleep (1);
|
[19:57] sustrik
|
zmq_send (msg);
|
[19:57] sustrik
|
}
|
[19:57] sustrik
|
something like that
|
[19:58] drbobbeaty
|
Yup, that's what I was planning. Very simple.
|
[19:58] sustrik
|
ack
|
[19:58] sustrik
|
ok, see you
|
[20:21] drbobbeaty
|
sustrik: I've got the second gist done and the results are stunning. The gist is: http://gist.github.com/635015 - it's a very simple transmitter that's sending a message a second on one of the 27 multicast channels.
|
[20:22] sustrik
|
and?
|
[20:22] drbobbeaty
|
When I have the same receiver gist running against this, it returns all kinds of numbers more than 1/sec. But when I comment out all but the one that's actually active, it returns 1/sec - just like you'd expect.
|
[20:22] sustrik
|
good
|
[20:22] drbobbeaty
|
It really looks like something isn't working on the filtering.
|
[20:22] sustrik
|
so we have a reproducible test case
|
[20:23] drbobbeaty
|
You can build and run these gists as they have no dependencies other than ZeroMQ.
|
[20:23] drbobbeaty
|
Yup, we do.
|
[20:23] sustrik
|
can you report the problem on the mailing list, point to the gist etc.?
|
[20:23] drbobbeaty
|
You bet. Be glad to.
|
[20:23] sustrik
|
great, thanks!
|