[Time] Name | Message |
[00:01] mikko
|
did it work?
|
[00:02] dan
|
yeah
|
[00:02] dan
|
im trying to figure out how to "share it" lol
|
[00:02] mikko
|
just the url
|
[00:02] dan
|
git@gist.github.com:2f8a0d953fa7c1acc7d6.git
|
[00:03] dan
|
https://gist.github.com/2f8a0d953fa7c1acc7d6
|
[00:03] mikko
|
thanks
|
[00:03] mikko
|
let me take a look
|
[00:03] mikko
|
thats not the ipc one?
|
[00:03] dan
|
The Connections.cpp and Connections.py files define "Publisher" and "Subscriber" classes
|
[00:03] mikko
|
what did you use as URI for the ipc?
|
[00:04] dan
|
there is an intermediate service on tcp that tells the publisher and subscribers where to connect
|
[00:05] dan
|
ah crap i forgot the file for the intermediate name look-up...
|
[00:05] dan
|
the ipc names are just ipc://something.ipc
|
[00:05] dan
|
the publisher gets told to bind to both tcp and ipc ports
|
[00:06] mikko
|
hmm
|
[00:06] dan
|
the subscriber gets told to bind to an ipc (if it is on the same machine as the publisher) or tcp if it is remote
|
[00:06] mikko
|
are you running them on same directory?
|
[00:06] dan
|
... no
|
[00:06] mikko
|
ipc:///tmp/test.ipc
|
[00:06] mikko
|
that would be an absolute path to /tmp/test.ipc
|
[00:07] mikko
|
ipc://something.ipc would be relative to the current working directory
|
[00:07] mikko
|
notice the amount of slashes
|
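The slash rule can be checked directly: ZeroMQ treats everything after the `ipc://` prefix as a filesystem path. A small illustrative helper (not part of any binding's API, just a sketch of the rule):

```python
def ipc_path(endpoint):
    """Return the filesystem path named by an ipc:// endpoint.

    Everything after the "ipc://" prefix is the path, so the number
    of slashes matters:
      ipc:///tmp/test.ipc  -> /tmp/test.ipc   (absolute)
      ipc://something.ipc  -> something.ipc   (relative to cwd)
    """
    prefix = "ipc://"
    assert endpoint.startswith(prefix)
    return endpoint[len(prefix):]

print(ipc_path("ipc:///tmp/test.ipc"))  # /tmp/test.ipc
print(ipc_path("ipc://something.ipc"))  # something.ipc
```

This is why dan's two processes only found each other when run from the same directory: with two slashes, each resolved the socket file against its own cwd.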
[00:07] dan
|
ah ok that would make sense
|
[00:07] dan
|
let me try that
|
[00:09] dan
|
awesome
|
[00:09] dan
|
that worked
|
[00:09] dan
|
thanks
|
[00:10] mikko
|
no problem
|
[00:11] dan
|
one other question i just thought of - what is the correct way to get a python subscriber to exit cleanly if it is stuck waiting on a recv()?
|
[00:11] dan
|
(like in SubTest.py) ?
|
[00:13] mikko
|
ctrl + c doesnt work?
|
[00:13] dan
|
no - it seems to take kill -9
|
[00:14] mikko
|
i don't really know about python that much
|
[00:14] mikko
|
i think python might be eating the signal
|
[00:14] dan
|
lol neither do i
|
[00:15] dan
|
well thats why i tried putting the signal handler in
|
[00:15] dan
|
but it doesn't seem to have any effect
|
[00:15] mikko
|
one option would be to use zmq poll
|
[00:16] mikko
|
use a timeout on socket and after timeout check a flag or something
|
[00:16] mikko
|
which you set in signal handler
|
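The pattern mikko describes — poll with a timeout and check a flag set by the signal handler — could look roughly like this in Python. `poll_once` here is a hypothetical stand-in; with pyzmq it would wrap `zmq.Poller.poll(timeout)` plus `socket.recv()`:

```python
import signal

interrupted = False

def on_sigint(signum, frame):
    # just record the interrupt; the recv loop notices it on its next timeout
    global interrupted
    interrupted = True

signal.signal(signal.SIGINT, on_sigint)

def recv_loop(poll_once):
    # poll_once(timeout_ms) returns a message, or None if the timeout expired;
    # waking up every 100 ms gives the loop a chance to see the flag and exit
    received = []
    while not interrupted:
        msg = poll_once(100)
        if msg is not None:
            received.append(msg)
    return received
```

Because the process is never parked in a blocking recv() for longer than the timeout, ctrl+c gets handled within ~100 ms instead of requiring kill -9.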
[00:16] dan
|
that sounds reasonable
|
[00:16] dan
|
i have not used the poll feature yet
|
[00:17] dan
|
at any rate - thanks for your time and help!
|
[00:19] mikko
|
no probs
|
[00:19] mikko
|
its usually more lively in here around day time (eu time)
|
[00:19] dan
|
ah - are you a contributor?
|
[04:35] rhino
|
Hi, has anyone loaded jzmq into eclipse before?
|
[08:29] yrashk
|
I am confused again.. which erlzmq is supposed to be better? I have currently packaged official one (zeromq/erlzmq) into erlagner.org index.
|
[08:32] sustrik
|
there are several of them?
|
[08:32] jugg
|
imo http://github.com/csrl/erlzmq is more feature complete and stable. The "official" version is useful for high volume 'sub' sockets, and playground use. But is pretty buggy otherwise.
|
[08:34] jugg
|
disclaimer, 'csrl' is me.
|
[08:34] sustrik
|
anything that zeromq/erlzmq has and csrl/erlzmq hasn't?
|
[08:34] sustrik
|
i know
|
[08:34] yrashk
|
sustrik: I am trying to understand what's the difference between csrl and zeromq
|
[08:35] sustrik
|
no idea
|
[08:35] sustrik
|
jugg should be able to explain
|
[08:35] yrashk
|
jugg: then I just sent you a message on github :)
|
[08:35] jugg
|
zeromq/erlzmq will actively push incoming data to the erlang process. csrl/erlzmq won't.
|
[08:35] yrashk
|
would you mind checking it out?
|
[08:36] sustrik
|
yrashk: what message? i haven't got anything yet.
|
[08:36] yrashk
|
sustrik: no, I have sent it to hugg
|
[08:36] yrashk
|
jugg*
|
[08:36] yrashk
|
I wrote " jugg: then I just sent you a message on github :)"
|
[08:37] sustrik
|
oops
|
[08:37] sustrik
|
:)
|
[08:37] yrashk
|
:]
|
[08:37] sustrik
|
jugg: i think there used to be both options, no?
|
[08:37] jugg
|
yrashk, sure, I'll do that.
|
[08:37] sustrik
|
iirc, something like "passive" flag
|
[08:37] yrashk
|
sustrik: I think zeromq/erlzmq has active option
|
[08:37] jugg
|
sustrik, yes, my version doesn't provide that option.
|
[08:38] jugg
|
'active' flag actually...
|
[08:38] sustrik
|
aha
|
[08:38] sustrik
|
ok, any chance to merge the two implementations?
|
[08:38] yrashk
|
jugg: please let me know when you'll merge that patch in, I'll package your fork right after that
|
[08:38] yrashk
|
sustrik: that would be nice.. I guess
|
[08:38] jugg
|
yrashk, ok
|
[08:38] yrashk
|
less to manage for me :]
|
[08:39] sustrik
|
yrashk: are you an erlzmq maintainer?
|
[08:39] jugg
|
sustrik, the 'active' flag is only useful for subscription sockets. If used on other sockets it is the source of problems.
|
[08:39] yrashk
|
sustrik: erlzmq.agner maintainer, it's just a "spec" package
|
[08:40] sustrik
|
i see
|
[08:40] yrashk
|
I am the guy behind erlagner.org operations, we're finally doing a proper packaging system for erlang
|
[08:41] sustrik
|
ah, i've heard about it
|
[08:41] sustrik
|
not an erlang user myself though
|
[08:41] jugg
|
saleyn will not merge my changes unless I implement 'active' option support and a couple of other minor things. But I don't have the time to do it correctly at this point, and I'm not interested in merging the implementation as is, given the issues surrounding it.
|
[08:42] sustrik
|
jugg: iirc active was introduced to allow erlzmq to work in non-lock-step fashion
|
[08:42] yrashk
|
this is why I'd prefer to package both official and jugg's versions
|
[08:42] jugg
|
yes, my implementation supports polling, which negates the need for it in that regard.
|
[08:42] sustrik
|
well, no way out at the moment
|
[08:43] sustrik
|
you may also consider adding new binding page to zeromq.org
|
[08:43] sustrik
|
there's a precedent:
|
[08:43] sustrik
|
there are 2 ruby bindings
|
[08:43] jugg
|
saleyn cites possible performance implications, however no actual testing has been done afaik.
|
[08:44] yrashk
|
how much is the slowdown on raw 0mq vs erlzmq btw?
|
[08:44] yrashk
|
on either binding
|
[08:44] yrashk
|
or, let me rephrase, how severe*
|
[08:44] sustrik
|
nobody really tested it, afaik
|
[08:44] yrashk
|
and erlzmq vs raw 0mq, to be more correct :]
|
[08:45] yrashk
|
heh
|
[08:45] yrashk
|
too bad
|
[08:45] yrashk
|
I am actually actively considering using erlzmq in one of the oncoming projects
|
[08:45] sustrik
|
testing it should be easy
|
[08:45] sustrik
|
there are C tests in the /perf subdirectory
|
[08:46] sustrik
|
as for erlzmq, i don't know, but most bindings have /perf subdir as well
|
[08:46] jugg
|
A different comparison: I have two interfaces to my application, one native erlang, and another using erlzmq. zeromq doesn't significantly add any overhead.
|
[08:46] sustrik
|
containing exactly the same tests
|
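Those throughput tests are genuinely short. A hypothetical Python sketch of the receiving side (local_thr) — `recv` stands in for a blocking `socket.recv()`, and, as in the C version, the clock starts after the first message so connection setup isn't counted:

```python
import time

def local_thr(recv, message_count):
    recv()  # first message: start timing only once the stream is flowing
    start = time.monotonic()
    for _ in range(message_count - 1):
        recv()
    elapsed = time.monotonic() - start
    # mean throughput in messages per second
    return message_count / elapsed if elapsed > 0 else float("inf")
```

The remote side is symmetrical: connect, then send `message_count` messages of the chosen size in a loop.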
[08:46] yrashk
|
jugg: so it's just comparable?
|
[08:47] yrashk
|
I hoped it will be faster, lol :)
|
[08:47] jugg
|
native erlang vs erlzmq, yes comparable.
|
[08:47] yrashk
|
although of course it is unlikely to be faster
|
[08:47] sustrik
|
it's native erlang vs. 0mq; what you asked about is raw 0MQ from C vs. erlzmq
|
[08:47] yrashk
|
yeah I understand
|
[08:48] yrashk
|
I suppose raw 0mq is much, much faster
|
[08:48] sustrik
|
jugg: are there perf tests in erlzmq?
|
[08:48] yrashk
|
sustrik: nope
|
[08:49] sustrik
|
then you may consider writing them, it's a few lines of code
|
[08:49] jugg
|
no
|
[08:50] yrashk
|
sustrik: like porting https://github.com/zeromq/pyzmq/tree/master/perf ?
|
[08:50] mikko
|
good morning
|
[08:50] sustrik
|
morning
|
[08:50] sustrik
|
yrashk: these seem unnecessarily complex
|
[08:51] yrashk
|
sustrik: do you have any other examples?
|
[08:51] sustrik
|
what about this:
|
[08:51] sustrik
|
https://github.com/zeromq/rbzmq/tree/master/perf
|
[08:52] yrashk
|
yup, quite simple
|
[09:02] yrashk
|
porting local_lat
|
[09:03] yrashk
|
hmmm erlzmq has no support for contexts
|
[09:03] yrashk
|
:)
|
[09:10] sustrik
|
doesn't matter, there's a global context in the lib, i assume
|
[09:11] sustrik
|
btw, setting HWM in local_lat is superfluous
|
[09:11] sustrik
|
no idea how it got into ruby perf suite
|
[09:14] yrashk
|
lol
|
[09:14] yrashk
|
anyway I actually switched to porting _thr tests
|
[09:14] sustrik
|
ack
|
[09:14] yrashk
|
local_thr is done
|
[09:14] sustrik
|
lol
|
[09:14] yrashk
|
:D
|
[09:15] sustrik
|
you are fast
|
[09:15] yrashk
|
realtime progress, eh?
|
[09:18] yrashk
|
what size & count should I test it on?
|
[09:21] yrashk
|
I think official erlzmq is somewhat broken
|
[09:22] jugg
|
yrashk, pushed.
|
[09:22] yrashk
|
jugg: cool!
|
[09:23] yrashk
|
zeromq/erlzmq setsockopt doesn't work with the type of sockets socket/* create :S
|
[09:26] yrashk
|
aaand {error,einval} :-\
|
[09:29] yrashk
|
ok, I'll try to package jugg's fork now
|
[09:32] yrashk
|
done
|
[09:32] yrashk
|
it looks much cleaner
|
[09:35] CIA-21
|
zeromq2: 03Martin Sustrik 07master * r28f3e87 10/ (5 files):
|
[09:35] CIA-21
|
zeromq2: Add delay before reconnecting
|
[09:35] CIA-21
|
zeromq2: So far ZMQ_RECONNECT_IVL delay was used only when TCP connect
|
[09:35] CIA-21
|
zeromq2: failed. Now it is used even if connect succeeds and the peer
|
[09:35] CIA-21
|
zeromq2: closes the connection afterwards.
|
[09:35] CIA-21
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/i9iO9x
|
[09:36] sustrik
|
yrashk: any perf results?
|
[09:36] sustrik
|
use something like 1M messages, each 1B long
|
[09:39] yrashk
|
sustrik: almost there
|
[09:39] yrashk
|
sustrik: fixing last bugs
|
[09:39] jugg
|
yrashk, for those perf tests, if you run local/remote on the same node under the same zmq context, you'll be sharing a single port which could cause a bottleneck. Either run them on separate nodes, or create two contexts, one for local, one for remote. (Using a port per socket is a TODO item).
|
[09:40] yrashk
|
it is late in the night so I am not sure if my math is right
|
[09:40] yrashk
|
jugg: testing on separate vms
|
[09:40] jugg
|
k
|
[09:40] yrashk
|
Throughput = MessageCount / Elapsed
|
[09:40] yrashk
|
Elapsed is in microseconds
|
[09:41] yrashk
|
I guess I need to multiply by 1000
|
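For the record, with Elapsed measured in microseconds the conversion factor is 1,000,000, not 1,000: msg/s = count × 10⁶ / elapsed_µs. A quick sanity check against the ~30k msg/s figures reported in this session:

```python
def throughput(message_count, elapsed_us):
    # elapsed is in microseconds, so scale by 1_000_000 to get msg/s
    return message_count * 1_000_000 / elapsed_us

# 1M one-byte messages in ~32.76 s lands near the ~30k msg/s seen here
print(round(throughput(1_000_000, 32_760_000)))  # 30525
```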
[09:42] yrashk
|
1M messages taking some time...
|
[09:42] yrashk
|
quite some time
|
[09:42] yrashk
|
sustrik: how long 1M of 1B is supposed to be with raw 0mq?
|
[09:42] sustrik
|
fraction of a second
|
[09:43] yrashk
|
ha
|
[09:43] yrashk
|
well erlzmq is doomed then
|
[09:43] jugg
|
can you post the code?
|
[09:44] sustrik
|
0.3 sec on my box
|
[09:44] yrashk
|
https://gist.github.com/5c3005692b594c318897
|
[09:44] yrashk
|
sustrik: with raw 0mq right?
|
[09:45] sustrik
|
yes
|
[09:46] yrashk
|
well it took about half a minute here
|
[09:46] yrashk
|
hehe
|
[09:46] yrashk
|
for tcp transport
|
[09:46] yrashk
|
mean throughput: 30524.908462667627 [msg/s]
|
[09:47] yrashk
|
(if my math is correct)
|
[09:47] yrashk
|
(this is csrl's fork)
|
[09:52] yrashk
|
official one, I can't even test yet
|
[09:52] yrashk
|
getting einval on recv
|
[09:53] yrashk
|
sustrik: jugg ^^^
|
[09:54] sustrik
|
strange
|
[09:54] yrashk
|
this performance isn't impressive at all :-(
|
[09:55] sustrik
|
maybe because of "active"?
|
[09:56] sustrik
|
in any case, don't throw the tests away
|
[09:56] sustrik
|
let's merge them to erlzmq
|
[09:56] yrashk
|
I am trying to set active to false explicitly
|
[09:57] yrashk
|
ok it worked
|
[09:57] yrashk
|
I think it was enabled by default
|
[09:58] yrashk
|
I suspect that official w/o active is even slower
|
[09:58] yrashk
|
no
|
[09:58] yrashk
|
about the same
|
[09:58] yrashk
|
mean throughput: 32142.1195960057 [msg/s]
|
[09:58] sustrik
|
hm
|
[09:59] yrashk
|
creating another version to test with active on
|
[10:01] yrashk
|
active was even worse
|
[10:01] yrashk
|
mean throughput: 28188.449027560524 [msg/s]
|
[10:01] mikko
|
sustrik: whats the process in updating openpgm?
|
[10:01] yrashk
|
and
|
[10:01] yrashk
|
mean throughput: 26443.701834780386 [msg/s]
|
[10:01] sustrik
|
mikko: i don't know, mato did it
|
[10:02] sustrik
|
i suppose it's just replacing the archive
|
[10:02] mikko
|
sustrik: i think we are soon at a point where we can invoke the openpgm autotools build
|
[10:02] mikko
|
and remove large amount of duplicated work
|
[10:02] sustrik
|
wow
|
[10:02] mikko
|
everything still works as it did before (bundled openpgm)
|
[10:02] sustrik
|
congrats
|
[10:03] mikko
|
but it just runs openpgm 'configure' from zeromq 'configure'
|
[10:03] sustrik
|
yrashk: time to profile erlzmq :)
|
[10:03] mikko
|
also, all compiler choices etc flow down to openpgm build
|
[10:04] pieterh
|
sustrik, fix to mailbox did not work, I've sent details to the list
|
[10:04] sustrik
|
nice
|
[10:04] sustrik
|
pieterh_: yes, it's a complex problem
|
[10:04] mikko
|
should i add gcov to zeromq as well?
|
[10:04] sustrik
|
see the original email in the thread
|
[10:04] mikko
|
so that we can visually see what's tested and what's not?
|
[10:04] yrashk
|
sustrik, jugg: ok so I have patches with perf tests to merge in now, interested?
|
[10:04] pieterh
|
sustrik: IMO there is a bug in the mailbox handling of reconnect commands
|
[10:05] pieterh
|
the mailbox grows consistently for each reconnect
|
[10:05] jugg
|
yrashk, sure
|
[10:05] pieterh
|
doesn't matter how fast or slow you do them
|
[10:05] sustrik
|
mikko: why not
|
[10:05] yrashk
|
jugg: sent performance tests pull req
|
[10:05] sustrik
|
pieterh_: it's because the application thread does not process the commands (it's sleeping)
|
[10:06] pieterh
|
sustrik: ack, any way to discard duplicates?
|
[10:06] yrashk
|
I am a very sad panda now
|
[10:06] sustrik
|
not really, they are in the middle of the socketpair buffer
|
[10:06] pieterh
|
then kill the parent socket if it doesn't behave
|
[10:06] sustrik
|
rather the reconnection logic should be rethought
|
[10:07] sustrik
|
that's an option
|
[10:07] sustrik
|
EBROKEN
|
[10:07] yrashk
|
sustrik: https://github.com/zeromq/erlzmq/pull/16
|
[10:07] pieterh
|
yes, exactly
|
[10:07] sustrik
|
yrashk: thx
|
[10:08] pieterh
|
it's not a 0MQ error, it's a caller error, so it should kill whatever's misbehaving and return an error
|
[10:08] yrashk
|
still, 30K 1-byte messages per second...
|
[10:08] pieterh
|
not assert, of course
|
[10:08] yrashk
|
:S
|
[10:08] pieterh
|
yrashk, you running AMQP?
|
[10:08] pieterh
|
just kidding, there was a 30K figure from someone impressed with AMQP, on Twitter...
|
[10:10] yrashk
|
pieterh_: no, erlzmq actually
|
[10:10] yrashk
|
i am not impressed
|
[10:10] yrashk
|
rather opposite
|
[10:11] pieterh
|
erlang does some heavy locking, I've been told
|
[10:11] yrashk
|
this is a very vague statement
|
[10:11] pieterh
|
indeed
|
[10:12] yrashk
|
I can't even reply to it
|
[10:12] pieterh
|
i mean, it's spending its time somewhere and that is not 0MQ and probably not erlzmq as such either
|
[10:13] mikko
|
http://www.erlang-factory.com/upload/presentations/303/rickard-green-euc2010.pdf
|
[10:13] mikko
|
erlang locking explained
|
[10:14] yrashk
|
I assume everyone knows what erlang rwlocks are
|
[10:14] mikko
|
i didn't before i read that
|
[10:15] pieterh
|
yrashk, few people here have every used erlang
|
[10:15] pieterh
|
*ever used
|
[10:16] pieterh
|
mikko, interesting stuff but does anyone have figures for the actual overhead of erlang rwlocks?
|
[10:18] pieterh
|
sustrik: any best practice on turning an identity blob_t into a printable string for error reporting?
|
[10:18] yrashk
|
what really upsets me is that erlzmq is maintained by saleyn
|
[10:19] yrashk
|
if he can't make it fast, then who can? :-\
|
[10:19] pieterh
|
maybe 30k/sec is fast, it's always relative...
|
[10:19] yrashk
|
I don't think it's fast
|
[10:19] yrashk
|
if raw 0mq can do it like 100 times faster
|
[10:19] pieterh
|
Have you measured the raw speed of two erlang tasks talking to each other?
|
[10:19] yrashk
|
nope
|
[10:20] yrashk
|
I think jugg had
|
[10:20] yrashk
|
did you?
|
[10:20] pieterh
|
So Erlang is afaik not meant to be a high performance language but a language for reliable distributed systems
|
[10:21] pieterh
|
I suspect (with only third-hand information) that you'll find it to be around 30K/second
|
[10:21] pieterh
|
Which is actually fine for many apps
|
[10:21] pieterh
|
Especially if they process only one stream of data
|
[10:22] yrashk
|
"I see therefore that on two different Erlang nodes, this benchmark gives a result of an average of 700 thousands ping/pong messages per minute"
|
[10:22] yrashk
|
just googled this
|
[10:22] yrashk
|
well it looks like we're doing more
|
[10:22] pieterh
|
hmm, 20x faster...
|
[10:22] yrashk
|
we have million per 30 seconds
|
[10:22] pieterh
|
per *minute*...
|
[10:23] yrashk
|
it's ping/pong tho
|
[10:23] pieterh
|
that's ping pongs
|
[10:23] pieterh
|
so double it
|
[10:23] yrashk
|
1,4mln
|
[10:23] pieterh
|
1,400,000/minute
|
[10:23] pieterh
|
23k/second
|
[10:23] yrashk
|
sustrik: in your tests, I assume you run over tcp?
|
[10:24] yrashk
|
"I see therefore that within the same Erlang node, this benchmark gives a result of an average of 5.3 million ping/pong messages per minute, the same that we had without the queuing mechanism."
|
[10:24] pieterh
|
which is pretty much equal to 30k/sec allowing for system differences
|
[10:24] yrashk
|
sustrik: also, different processes?
|
[10:25] pieterh
|
so 160k/sec within same node
|
[10:26] yrashk
|
I just want to verify raw 0mq can do a million messages between two diff processes over tcp on macosx in 0.3 secs
|
[10:26] yrashk
|
whether raw 0mq*
|
[10:27] yrashk
|
does this sound about right?
|
[10:27] yrashk
|
obviously there's an overhead in any higher level language esp with VM likes beam
|
[10:27] pieterh
|
yrashk, not ping pong, just send?
|
[10:27] yrashk
|
just sends
|
[10:27] yrashk
|
and recvs
|
[10:27] pieterh
|
well, it will batch messages
|
[10:27] yrashk
|
but I want to make sure it is 100 times faster
|
[10:28] pieterh
|
so if you use 1 byte messages you will get a throughput of several million per second at least
|
[10:28] pieterh
|
depends a lot on your material here
|
[10:28] yrashk
|
I guess this is what is causing a problem in erlzmq
|
[10:28] pieterh
|
multicore = multifaster
|
[10:30] yrashk
|
does anybody have any other binding available with perf tests available?
|
[10:30] yrashk
|
like rbzmq?
|
[10:31] yrashk
|
I just don't want to go through system-wide installation of zeromq and such
|
[10:31] yrashk
|
I copied my test from ruby's remote/local_thr.rb
|
[10:33] mikko
|
do you want bindings tested or is local_thr/remote_thr from zeromq ok?
|
[10:34] yrashk
|
ideally I want both :D
|
[10:34] yrashk
|
I can do the zeromq thing myself I guess
|
[10:38] yrashk
|
mean throughput: 2315163 [msg/s]
|
[10:38] pieterh
|
yrashk, that's rbzmq?
|
[10:39] mikko
|
mean throughput: 4997086 [msg/s]
|
[10:39] mikko
|
over lo using tcp socket on OS X
|
[10:39] yrashk
|
pieterh_: that's raw 0mq
|
[10:39] yrashk
|
win 10
|
[10:39] yrashk
|
oops
|
[10:39] mikko
|
mean throughput: 5345358 [msg/s]
|
[10:39] mikko
|
over en0
|
[10:41] yrashk
|
this is all raw 0mq right?
|
[10:41] mikko
|
yes
|
[10:41] yrashk
|
anybody with rbzmq or pyzmq handy?
|
[10:42] mikko
|
i do
|
[10:42] mikko
|
i build them all :)
|
[10:42] mikko
|
sec
|
[10:42] yrashk
|
can you please run the same test? 1b msgs 1mln times
|
[10:44] mikko
|
yrashk: let me figure out how
|
[10:45] yrashk
|
mikko: in rbzmq those are exactly the same tests
|
[10:45] yrashk
|
perf/local_thr.rb perf/remote_thr.rb
|
[10:46] mikko
|
ok
|
[10:47] mikko
|
i think im running it now
|
[10:50] mikko
|
takes very long time with ruby
|
[10:50] yrashk
|
mikko: lets wait until it's done
|
[10:50] yrashk
|
did you use tcp://127.0.0.1:port 1 1000000
|
[10:50] yrashk
|
or something?
|
[10:51] mikko
|
i started with 100000000 or so
|
[10:51] mikko
|
now testing quickly with 10 that it actually works
|
[10:51] yrashk
|
well I want the same test
|
[10:51] yrashk
|
1 byte msgs, 1mln messages
|
[10:51] mikko
|
https://gist.github.com/62772ec66fc1a46a3372
|
[10:51] mikko
|
this is on linux
|
[10:51] yrashk
|
that's 10mlns but thanks anyway
|
[10:52] mikko
|
and ruby
|
[10:52] mikko
|
http://build.valokuva.org/
|
[10:52] mikko
|
i got those available
|
[10:57] jugg
|
porting yrashk's tests to do raw erlang message passing between two nodes yields 59645.1127719094 [msg/s].
|
[10:58] jugg
|
to clarify that is without zeromq
|
[10:59] mikko
|
ok
|
[11:00] yrashk
|
jugg: so twice as fast
|
[11:00] jugg
|
using zmq (tcp lo) those tests yield 21378.701789023213 [msg/s] on my system.
|
[11:01] jugg
|
erlzmq that is...
|
[11:01] jugg
|
anyway, I'm off. yrashk, I'll look at your pull request later.
|
[11:01] yrashk
|
well my pull req == git
|
[11:01] yrashk
|
gist
|
[11:01] mikko
|
yrashk: not many things on your blog
|
[11:02] mikko
|
in*
|
[11:02] yrashk
|
mikko: it's empty, I know
|
[11:02] yrashk
|
I keep myself busy
|
[11:02] mikko
|
is there more locking involved accessing zeromq sockets from erlang?
|
[11:02] mikko
|
like explicit locking?
|
[11:22] sustrik
|
re
|
[11:24] sustrik
|
pieterh_: no standard way to convert blob into string
|
[11:25] sustrik
|
yrashk: i've used TCP
|
[11:25] sustrik
|
yrashk: you can investigate further
|
[11:26] sustrik
|
by running C local_thr vs. Erlang remote_thr
|
[11:26] sustrik
|
and vice versa
|
[11:26] sustrik
|
that way you'll find out whether the bottleneck is on the sender or receiver side
|
[11:32] yrashk
|
sustrik: I think I have a vague idea on how to make erlzmq faster
|
[11:32] yrashk
|
will require a lot of rewriting, though
|
[11:33] sustrik
|
do you have any idea where the bottleneck is?
|
[11:33] yrashk
|
port driver communication
|
[11:33] yrashk
|
not confirmed, just an idea
|
[11:33] sustrik
|
it's good to measure it first
|
[11:33] sustrik
|
otherwise you may end with no perf improvement
|
[11:33] yrashk
|
hard to measure this, though
|
[11:33] yrashk
|
but possible
|
[11:34] sustrik
|
anyway, try the c/erlang perf tests
|
[11:34] yrashk
|
c/erlang?
|
[11:34] sustrik
|
it may turn out that the bottleneck is only on one side
|
[11:34] sustrik
|
<sustrik> yrashk: you can investigate further
|
[11:34] sustrik
|
<sustrik> by running C local_thr vs. Erlang remote_thr
|
[11:34] sustrik
|
<sustrik> and vice versa
|
[11:34] yrashk
|
oh
|
[11:34] yrashk
|
right
|
[11:34] yrashk
|
missed that
|
[11:35] yrashk
|
good idea actually
|
[11:36] yrashk
|
mean throughput: 30606 [msg/s]
|
[11:36] yrashk
|
this is C local_thr
|
[11:36] yrashk
|
mean throughput: 67086.58395785352 [msg/s]
|
[11:37] yrashk
|
this is Erlang local_thr
|
[11:37] yrashk
|
so partially it is the way erlzmq's send works
|
[11:37] yrashk
|
sustrik: ^^
|
[11:38] sustrik
|
both are low
|
[11:38] yrashk
|
I guess no batching happening with erlzmq
|
[11:38] sustrik
|
so there's some common problem
|
[11:38] yrashk
|
yes they are
|
[11:38] sustrik
|
both on send and recv side
|
[11:38] yrashk
|
but receiving got twice as fast
|
[11:38] sustrik
|
yup
|
[11:39] sustrik
|
however, don't focus on that, the primary goal is to speed up the whole beast ~50x
|
[11:39] yrashk
|
yeah 50x will be enough
|
[11:39] yrashk
|
that will be like about 1/2 of raw perf
|
[11:40] yrashk
|
which is acceptable
|
[11:40] sustrik
|
afterwards, you can check why recv is 2x faster
|
[11:40] sustrik
|
ack
|
[11:40] yrashk
|
well I am blaming erlzmq's model (driver) now
|
[11:40] sustrik
|
quite possible
|
[11:40] yrashk
|
driver communication isn't very fast as far as I understand
|
[11:40] yrashk
|
I can try converting this into a NIF
|
[11:40] sustrik
|
NIF?
|
[11:40] yrashk
|
which may or may not speed it up
|
[11:40] yrashk
|
Native Implemented Function
|
[11:41] yrashk
|
much less overhead
|
[11:41] yrashk
|
port communication sends data and parses it
|
[11:41] yrashk
|
which isn't fast
|
[11:42] yrashk
|
the only problem is that until some time in future, NIFs are blocking their scheduler. There is a way around that, though -- I wrote erlv8 so I know how to deal with tis
|
[11:42] yrashk
|
this*
|
[11:44] sustrik
|
i see
|
[11:44] yrashk
|
if only I had an unlimited supply of time
|
[11:45] yrashk
|
although I still may try to rewrite erlzmq
|
[11:45] yrashk
|
but that will be almost a complete rewrite
|
[11:45] yrashk
|
:-(
|
[11:46] yrashk
|
writing NIFs is not much fun, really
|
[11:47] yrashk
|
so are the drivers, though :)
|
[11:48] yrashk
|
sustrik: how does the batching work? is it time-based? like batch X messages within T?
|
[11:53] sustrik
|
it's triggered by the network stack
|
[11:53] sustrik
|
so, when the network stack notifies the user space that there's free space in tx buffer
|
[11:53] sustrik
|
0mq takes as much data as is available atm
|
[11:54] sustrik
|
and tries to push it into the kernel in a single go
|
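That send-side batching can be sketched in a few lines — an illustration of the mechanism sustrik describes, not 0MQ's actual code: when the network stack reports free tx-buffer space, everything queued so far goes to the kernel in one call.

```python
def flush(pending, writev):
    # called when the network stack reports free space in the tx buffer;
    # drain whatever is queued and hand it to the kernel in a single go
    batch = pending[:]
    del pending[:]
    if batch:
        writev(batch)
    return len(batch)
```

With many small messages queued between wakeups, the per-syscall cost is amortized across the whole batch — which is exactly what a per-message encode/pass/decode port round-trip forfeits.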
[11:54] yrashk
|
I ee
|
[11:54] yrashk
|
I see
|
[11:54] yrashk
|
well at this point I'd really blame the port communication
|
[11:54] sustrik
|
well, maybe
|
[11:55] sustrik
|
if you haven't time to play with it
|
[11:55] yrashk
|
it is an encode -> pass -> decode -> work -> encode -> decode routine
|
[11:55] sustrik
|
at least write an email to the ML
|
[11:55] yrashk
|
it is an encode -> pass -> decode -> work -> encode -> pass -> decode routine *
|
[11:55] sustrik
|
doesn't seem very efficient
|
[11:56] yrashk
|
those encodings/decodings aren't super slow by themselves, probably, but still, everything adds up
|
[11:56] yrashk
|
NIFs work nearly directly with the data
|
[11:56] sustrik
|
ack
|
[11:56] sustrik
|
yes, that's how other bindings are done
|
[11:56] sustrik
|
the problem with erlang is its threading model
|
[11:56] sustrik
|
afaiu blocking in NIF could get the whole VM to a standstill
|
[11:56] yrashk
|
yes
|
[11:57] yrashk
|
but there are two points:
|
[11:57] yrashk
|
1) this is going to change http://erlang-factory.com/conference/SFBay2011/speakers/RickardGreen
|
[11:58] yrashk
|
2) there are some workarounds (basically, two-thread model). I used it in erlv8. Incidentially, I ended up using 0mq inside erlv8's NIF
|
[11:58] sustrik
|
:)
|
[11:59] yrashk
|
I will be at that conference (I am a speaker there, too) so I will have a chance to attend that talk
|
[11:59] yrashk
|
(in pt. 1)
|
[11:59] sustrik
|
i see
|
[12:00] sustrik
|
are you from SF?
|
[12:00] yrashk
|
nope
|
[12:00] yrashk
|
http://erlang-factory.com/conference/SFBay2011/speakers/YuriiRashkovskii
|
[12:00] yrashk
|
Vancouver, BC
|
[12:01] sustrik
|
aha, just that i'm visiting SF shortly
|
[12:02] yrashk
|
oh
|
[12:02] sustrik
|
but before the conference
|
[12:02] sustrik
|
no overlap there
|
[12:02] yrashk
|
too bad
|
[12:02] yrashk
|
I'd love to share a beer or something
|
[12:02] yrashk
|
open source developers don't get much face-to-face time
|
[12:03] yrashk
|
and it's bad
|
[12:03] pieterh
|
yeah, 5,000 open source developers at FOSDEM earlier in Feb, here in Brussels
|
[12:03] yrashk
|
well I didn't say they don't get it at all
|
[12:03] yrashk
|
just not much
|
[12:03] yrashk
|
:D
|
[12:03] pieterh
|
I mean, it's only once a year
|
[12:04] yrashk
|
I see
|
[12:04] yrashk
|
it would be cool to have a rolling set of unconferences happening in every "hub" city
|
[12:04] yrashk
|
well, they do happen
|
[12:04] yrashk
|
but they are fairly limited
|
[12:05] pieterh
|
sustrik: why not do a 0MQ meetup while you're there...
|
[12:06] sustrik
|
yes, i'll do that
|
[12:06] yrashk
|
record some video! :)
|
[12:06] yrashk
|
if there are going to be any presentations
|
[12:06] sustrik
|
nope, just people drinking beer :)
|
[12:06] yrashk
|
sigh
|
[12:06] yrashk
|
:]
|
[12:07] pieterh
|
yrashk, there's a video of me (I think) praising Erlang at FOSDEM
|
[12:07] pieterh
|
that's "praising Erlang, I think" not "a video of me, I think"...
|
[12:07] yrashk
|
pieterh_: "of me (I think)"
|
[12:07] yrashk
|
were you drunk? :)
|
[12:07] pieterh
|
one is never totally sure
|
[12:07] pieterh
|
FOSDEM is fairly beer intensive
|
[12:08] yrashk
|
that goes without saying
|
[12:08] yrashk
|
although I've never been to one
|
[12:08] yrashk
|
but just about any dev meetup implies beer
|
[12:08] pieterh
|
yeah, but in Belgium that implies "how strong can we get the beer, and how cheaply can we sell it"
|
[12:08] yrashk
|
too bad I am a little sensitive to alcohol since like 2005 so I don't drink much
|
[12:09] pieterh
|
me too, sadly
|
[12:09] yrashk
|
esp beer :-(
|
[12:09] yrashk
|
weak stomach
|
[12:09] pieterh
|
anyhow fwiw the video is here: // If the session already has an engine attached, destroy new one.
|
[12:09] pieterh
|
// Note new engine is not plugged in yet, we don't have to unplug it.
|
[12:09] pieterh
|
if (engine) {
|
[12:09] pieterh
|
delete engine_;
|
[12:09] pieterh
|
puts ("HEY!");
|
[12:09] pieterh
|
return;
|
[12:09] pieterh
|
}
|
[12:09] yrashk
|
that's not the video
|
[12:09] yrashk
|
:D
|
[12:09] pieterh
|
aw, wrong paste
|
[12:10] pieterh
|
http://www.youtube.com/watch?v=CCBYzKfmQ4U
|
[12:11] yrashk
|
nice accent
|
[12:15] pieterh
|
What's erlv8? I couldn't find that on google
|
[12:15] yrashk
|
part of http://beamjs.org/
|
[12:16] yrashk
|
tightly integrated V8 for Erlang
|
[12:16] yrashk
|
heeey
|
[12:16] yrashk
|
Erlang is not a "weird language"
|
[12:16] yrashk
|
;)
|
[12:17] pieterh
|
Well, it's sooo relative ... :-)
|
[12:17] yrashk
|
it's not weird for me
|
[12:17] yrashk
|
:D
|
[12:18] yrashk
|
saying a language is weird is not exactly "praising" :-P
|
[12:18] yrashk
|
;)
|
[12:18] pieterh
|
that's what my crazy uncle says when he eats raw spinach
|
[12:19] kristsk
|
why is he eating it raw?
|
[12:20] pieterh
|
no-one knows, that's why we say he's weird, but he says "it's not weird for me"
|
[12:20] pieterh
|
I think 'weird' is high praise compared to 'boring', 'irrelevant', or 'ok'
|
[12:20] pieterh
|
people do speak highly of Erlang once they learn it
|
[12:20] pieterh
|
I'm too lazy and happily ignorant to make that effort
|
[12:20] pieterh
|
calling it 'weird' is much easier
|
[12:20] kristsk
|
same goes for lisp
|
[12:20] yrashk
|
;)
|
[12:20] yrashk
|
Erlang is actually a fantastic platform
|
[12:20] yrashk
|
I've used it starting in 2001
|
[12:21] yrashk
|
and I am always returning to it
|
[12:21] yrashk
|
from other platforms
|
[12:21] kristsk
|
addiction eh
|
[12:21] pieterh
|
it does seem to produce pretty good applications
|
[12:21] pieterh
|
lots of return on your development investment
|
[12:21] pieterh
|
but still...
|
[12:22] yrashk
|
it might not be very fast, but you know, only 20% of the code accounts for 80% of the slowdown
|
[12:22] pieterh
|
... weird :)
|
[12:22] yrashk
|
so you can rewrite bottlenecks even in C
|
[12:22] pieterh
|
just trolling, yrashk :)
|
[12:22] yrashk
|
and it's really easy to write some complicated stuff with it
|
[12:22] yrashk
|
ikr
|
[12:22] pieterh
|
I don't trust any language that's not been properly dead for 10 years
|
[12:22] kristsk
|
im kinda looking for some pet project to try erlang
|
[12:23] yrashk
|
see, I've been with erlang for 10 years and i still have hair ;)
|
[12:23] pieterh
|
true
|
[12:23] pieterh
|
did it grow faster after you learned Erlang, that's my question
|
[12:23] yrashk
|
kristsk: help porting socket.io server to erlang
|
[12:23] yrashk
|
pieterh_: of coure it did!
|
[12:23] yrashk
|
course*
|
[12:23] yrashk
|
kristsk: https://github.com/yrashk/socket.io-erlang
|
[12:23] kristsk
|
huh
|
[12:24] yrashk
|
I've barely started it but then again I don't have much time and my attention span is limited
|
[12:24] yrashk
|
erlang is like the best platform for socket.io
|
[12:24] kristsk
|
about same here
|
[12:24] kristsk
|
regarding attention span
|
[12:24] yrashk
|
:D
|
[12:24] yrashk
|
yet i manage to create something
|
[12:24] yrashk
|
this winter accounts for erlv8/beamjs and agner
|
[12:24] yrashk
|
not too much but at least something
|
[12:25] kristsk
|
"dear lord, what the heck is this? did i write this?"
|
[12:25] yrashk
|
hahaha
|
[12:25] yrashk
|
I experience this like almost every day
|
[12:27] yrashk
|
or at least every other day
|
[13:17] xet
|
hi
|
[13:22] sustrik
|
hi
|
[13:22] CIA-21
|
zeromq2: Martin Sustrik master * r1f536b2 / src/zmq_listener.cpp :
|
[13:22] CIA-21
|
zeromq2: Init object is child of listener
|
[13:22] CIA-21
|
zeromq2: This means that all the handshaking while accepting incoming
|
[13:22] CIA-21
|
zeromq2: connection is done exclusively in I/O threads, thus it won't
|
[13:22] CIA-21
|
zeromq2: overload the application thread's mailbox.
|
[13:22] CIA-21
|
zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/i0AFxk
|
[13:23] xet
|
any1 knows of a good realtime plotting lib to use with a zmq tcp socket?
|
[13:23] xet
|
realtime data stream i mean
|
[13:26] sustrik
|
no idea
|
[13:33] yrashk
|
so ok it looks like I am writing a new erlang binding
|
[13:33] yrashk
|
:-O
|
[13:33] yrashk
|
*facepalm*
|
[13:35] pieterh
|
yrashk, that'll make 3... :-)
|
[13:35] yrashk
|
:D
|
[13:35] pieterh
|
cremes: you around?
|
[13:43] yrashk
|
sustrik, pieterh_ I am trying to recall, is it safe to call zmq_msg_close after sending a message?
|
[13:43] yrashk
|
or should I set up an ffn instead?
|
[13:44] yrashk
|
umm
|
[13:44] yrashk
|
I think I can do without this at all, I'll let erlang deal with deallocation
|
[13:44] yrashk
|
it's probably error prone tho
|
[13:45] sustrik
|
yrashk: it's safe
|
[13:45] pieterh
|
yrashk, yes
|
[13:45] sustrik
|
unless you use zmq_init_data()
|
[13:45] sustrik
|
in which case 0mq will take ownership and deallocate the buffer you pass to it
|
[13:45] pieterh
|
which I'd propose to rename to zmq_init_data_are_you_really_sure ()
|
[13:46] sustrik
|
it should be zmq_sendmsg
|
[13:46] sustrik
|
and zmq_send should be zmq_send(void *data, size_t size, int flags);
|
[13:46] pieterh
|
there should be zmq_send (blob) (implied copy)
|
[13:46] pieterh
|
and zmq_sendz (message)
|
[13:46] yrashk
|
so should I do zmq_msg_init_size and then refer to a structure that is handled by erlang gc?
|
[13:46] yrashk
|
or init_size, then copy that binary
|
[13:47] yrashk
|
well, then I need an ffn
|
[13:47] pieterh
|
yrashk, you need to copy, IMO
|
[13:47] yrashk
|
then init_data
|
[13:47] sustrik
|
yrashk: forget about zero copy for now
|
[13:47] pieterh
|
unless you can guarantee that Erlang won't free the data before zmq sends it
|
[13:47] sustrik
|
you can add it later if possible
|
[13:47] yrashk
|
I can't guarantee that
|
[13:47] yrashk
|
so when should I close that message?
|
[13:47] pieterh
|
then just copy
|
[13:47] pieterh
|
after sending
|
[13:48] yrashk
|
even if I use init_data?
|
[13:48] pieterh
|
zmq_msg_init_size, memcpy, zmq_send, zmq_msg_close
|
[13:48] pieterh
|
do not use init_data
|
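The sequence pieterh spells out can be sketched as a small helper. This is a sketch against the 0MQ 2.x C API as discussed above; `socket` is assumed to be an already-created 0MQ socket, and error handling is omitted for brevity.

```c
#include <string.h>
#include <zmq.h>

/* Copy-based send, per the advice above: let 0MQ allocate and own the
   message buffer, memcpy the payload in, and close the message after
   sending. Error checks omitted for brevity. */
static int send_copy(void *socket, const void *data, size_t size)
{
    zmq_msg_t msg;
    zmq_msg_init_size(&msg, size);         /* 0MQ allocates the buffer */
    memcpy(zmq_msg_data(&msg), data, size);
    int rc = zmq_send(socket, &msg, 0);    /* 0MQ 2.x signature */
    zmq_msg_close(&msg);                   /* safe: the data was copied */
    return rc;
}
```

Because the payload is copied into the message, closing it right after `zmq_send` is always safe, which is exactly why this pattern is recommended over `zmq_msg_init_data` when the caller (here, the Erlang GC) may free the source buffer at any time.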
[13:48] pieterh
|
sustrik: this has to be explained more forcefully IMO
|
[13:49] yrashk
|
ok
|
[13:49] sustrik
|
yes, but the API change is even more important
|
[13:49] sustrik
|
unfortunately it can't be done till 3.0
|
[13:49] pieterh
|
idea: can we make the init method name match the send method?
|
[13:49] sustrik
|
?
|
[13:49] pieterh
|
so zmq_msg_init_zcopy and zmq_send_zcopy?
|
[13:49] sustrik
|
why so?
|
[13:49] pieterh
|
they are always used together, no?
|
[13:50] sustrik
|
there's only one send method
|
[13:50] pieterh
|
today, yes
|
[13:50] pieterh
|
oh... sorry... forget it
|
[13:50] sustrik
|
i would copy POSIX as close as possible
|
[13:50] pieterh
|
brainfart
|
[13:50] pieterh
|
anyhow, init_data is deceptively named IMO
|
[13:51] yrashk
|
will having no zero-copy seriously degrade perf?
|
[13:51] sustrik
|
only for very large messages
|
[13:51] pieterh
|
it looks like a normal sibling of init_size when in fact it's a weird spinach eating cousin
|
[13:51] sustrik
|
like 256MB or so
|
[13:51] yrashk
|
ok
|
[13:51] sustrik
|
pieterh_: yes, agreed, but we can't change it till 3.0
|
[13:51] yrashk
|
likely, seriously degrade?
|
[13:51] sustrik
|
shrug
|
[13:51] yrashk
|
like*
|
[13:51] pieterh
|
sustrik, np, can I add that to the roadmap page?
|
[13:52] sustrik
|
it's there
|
[13:52] pieterh
|
yrashk, for large messages, you will see a visible difference...
|
[13:52] sustrik
|
yrashk: http://www.zeromq.org/results:copying
|
[13:56] pieterh
|
sustrik, it wasn't there but I've added it and made a few other updates
|
[13:57] pieterh
|
3.0 is going to be fun... :-)
|
[13:57] pieterh
|
We're like MSFT, we get it right by version 3...
|
[13:58] sustrik
|
thanks
|
[13:58] sustrik
|
it's not obvious how to do the switch though
|
[13:59] sustrik
|
it's going to be painful
|
[14:00] pieterh
|
90% of people work via bindings
|
[14:00] pieterh
|
and bindings can handle both APIs, quite simply IMO
|
[14:01] pieterh
|
that made more sense before I wrote it :-(
|
[14:04] sustrik
|
:)
|
[14:08] yrashk
|
debugging FFIs i fun
|
[14:08] yrashk
|
is*
|
[14:08] yrashk
|
Socket operation on non-socket
|
[14:08] yrashk
|
:D
|
[14:08] sustrik
|
what's needed imo is transitory period
|
[14:08] sustrik
|
when both APIs would be available
|
[14:09] sustrik
|
presumably, with the 2.0 API being a lightweight wrapper over 3.0-like internals
|
[14:10] yrashk
|
hmm errno 156384765
|
[14:11] yrashk
|
:d
|
[14:11] sustrik
|
see zmq.h
|
[14:11] yrashk
|
trying
|
[14:11] sustrik
|
ETERM
|
[14:11] sustrik
|
that means the context was already closed
|
[14:12] yrashk
|
ETERM
|
[14:12] yrashk
|
ya
|
[14:15] sustrik
|
mikko: are you here by chance
|
[14:15] sustrik
|
?
|
[14:15] mikko
|
yes
|
[14:16] sustrik
|
are there any changes needed to the build system in case the names of exported symbols change?
|
[14:16] mikko
|
shouldn't be
|
[14:16] sustrik
|
the visibility thing?
|
[14:16] mikko
|
thats per symbol
|
[14:16] mikko
|
take a look at zmq.h
|
[14:17] sustrik
|
ZMQ_EXPORT, right?
|
[14:17] mikko
|
yes
|
[14:17] sustrik
|
mikko: thanks
|
[14:17] mikko
|
are you looking to support two apis in one lib?
|
[14:17] sustrik
|
yep
|
[14:17] mikko
|
hmm
|
[14:17] sustrik
|
still the option of exporting _zmq_init etc. seems most viable
|
[14:18] mikko
|
why do you need to support two versions in one library?
|
[14:18] sustrik
|
and supplying simple zmq_init etc. wrappers in the header file
|
[14:18] sustrik
|
otherwise the migration to 3.0 API would have to be done in one go
|
[14:18] sustrik
|
which is almost impossible, given all the bindings etc.
|
[14:19] yrashk
|
heh
|
[14:19] yrashk
|
almost there!
|
[14:20] mikko
|
i dont know why but i have some doubts about the approach
|
[14:20] mikko
|
it sounds complex
|
[14:21] sustrik
|
any better idea?
|
[14:21] yrashk
|
ok
|
[14:21] yrashk
|
works
|
[14:21] yrashk
|
:]
|
[14:21] yrashk
|
that was fast
|
[14:21] sustrik
|
new binding?
|
[14:21] yrashk
|
yep
|
[14:21] pieterh
|
sustrik, transitional API sounds a good idea, it can be an optional header you include
|
[14:21] sustrik
|
your keyboard must have caught fire in the process :)
|
[14:21] yrashk
|
currently with block receive but I am going to solve this by using 2nd thread and 0mq inproc
|
[14:21] yrashk
|
sustrik: no, my kbd is cool
|
[14:22] pieterh
|
i think that's a world record for a new binding
|
[14:22] yrashk
|
it's an incomplete binding
|
[14:22] yrashk
|
sockopts are not there yet
|
[14:22] yrashk
|
and non-blocking receive is not there yet, too
|
[14:22] yrashk
|
https://gist.github.com/6e203bc2aeb410b36532
|
[14:23] sustrik
|
still:
|
[14:23] sustrik
|
<yrashk> so ok it looks like I am writing a new erlang binding
|
[14:23] sustrik
|
at 14:33
|
[14:23] sustrik
|
15:21: works
|
[14:23] pieterh
|
under 1 hour...
|
[14:24] sustrik
|
:)
|
[14:24] sustrik
|
pieterh_: the API shift is doable even now
|
[14:24] pieterh
|
yup
|
[14:24] pieterh
|
it can be done with macros in zmq.h
|
[14:24] sustrik
|
but maybe forking the stable should be done first
|
[14:24] mikko
|
sustrik: i guess not
|
[14:25] pieterh
|
yes, no new features before we fork...
|
[14:25] sustrik
|
ack
|
[14:25] mikko
|
when is master going to be open for radical changes?
|
[14:25] yrashk
|
sustrik: https://github.com/yrashk/ezmq
|
[14:25] pieterh
|
mikko: are you in a hurry?
|
[14:26] mikko
|
pieterh_: no
|
[14:26] sustrik
|
yrashk: nice
|
[14:26] sustrik
|
any perf results?
|
[14:26] pieterh
|
easymq, hehe
|
[14:26] mikko
|
this is the openpgm autotools build change
|
[14:27] mikko
|
which also includes an update from openpgm 5.0.x to 5.1.x
|
[14:27] pieterh
|
mikko: that does not risk introducing bugs IMO
|
[14:27] pieterh
|
i'd get that into the master asap
|
[14:27] mikko
|
well, openpgm update is potentially risky (?)
|
[14:28] pieterh
|
hmm, we didn't see any real issues with the switch from openpgm 2.x to 5.0.x...
|
[14:28] pieterh
|
afair
|
[14:29] pieterh
|
if you don't get that into master before we clone it, it'll be for 2.2 only
|
[14:29] pieterh
|
as you like
|
[14:29] mikko
|
ok
|
[14:29] sustrik
|
pieterh_: btw, are you not going to fork from a historic version?
|
[14:29] mikko
|
i'll have to check with steve-p
|
[14:29] mikko
|
steve-o
|
[14:29] sustrik
|
my feeling was that you want to fork pre-XPUB/XSUB version
|
[14:30] pieterh
|
sustrik, nope, my plan is to fork head and then make releases as we collect fixes to it
|
[14:30] sustrik
|
ok
|
[14:30] pieterh
|
we can disable XPUB/XSUB if needed (just the definitions, not the internals)
|
[14:30] pieterh
|
that's a 2.2 feature, I guess
|
[14:31] sustrik
|
it's just a placeholder atm
|
[14:31] pieterh
|
I've seen that
|
[14:31] pieterh
|
so how do I get stuff off sys://log
|
[14:31] pieterh
|
connect a SUB socket to it?
|
[14:31] sustrik
|
yes
|
[14:32] pieterh
|
ok, trying that now...
|
[14:36] cremes
|
pieterh_: what's up?
|
[14:36] pieterh
|
cremes: we found & nailed the real issue with mailboxes
|
[14:36] pieterh
|
of course it's a 1-line fix :-)
|
[14:36] cremes
|
oh, of course!
|
[14:37] pieterh
|
well, it's one line of magic from Martin...
|
[14:37] cremes
|
has it been committed already?
|
[14:37] pieterh
|
yup
|
[14:37] cremes
|
i'm going to look...
|
[14:39] cremes
|
launch_sibling became launch_child? that truly is magic
|
[14:39] yrashk
|
sustrik: no perfs yet, I haven't done 2nd thread non-block recv yet anyway
|
[14:39] yrashk
|
working on sockopts
|
[14:39] pieterh
|
sustrik, ok, it works like a charm
|
[14:40] pieterh
|
Compiling mailbugz...
|
[14:40] pieterh
|
Linking mailbugz...
|
[14:40] pieterh
|
E: (syslog) DUID: peer using non-unique identity - disconnected
|
[14:40] pieterh
|
E: (syslog) DUID: 0x444945
|
[14:40] pieterh
|
E: (syslog) DUID: "DIE"
|
[14:40] sustrik
|
yrashk: ack
|
[14:40] pieterh
|
that message comes from session.cpp, and object.cpp formats and prints the identity
|
[14:40] pieterh
|
and a thread in mailbugz catches and reports the error
|
[14:41] sustrik
|
pieterh_: it should be one log recoed imo
|
[14:41] sustrik
|
record
|
[14:41] pieterh
|
sustrik, sure, this is a proof of concept
|
[14:42] sustrik
|
in the future it would be nice to add IP address of the offending peer
|
[14:42] pieterh
|
and etc...
|
[14:42] pieterh
|
yes
|
[14:42] pieterh
|
my C++ skills are too poor to do this properly
|
[14:43] sustrik
|
use raw C
|
[14:43] pieterh
|
done that, np
|
[14:43] pieterh
|
ok, will add IP address and make a single string
|
[14:43] sustrik
|
the IP addr stuff may be tricky imo
|
[14:44] pieterh
|
does the engine not have a socket it can look at?
|
[14:44] sustrik
|
it does
|
[14:44] sustrik
|
but there's no code to do that
|
[14:44] pieterh
|
ok, well, one step at a time
|
[14:44] sustrik
|
exactly
|
[14:44] pieterh
|
the first thing is to get end-to-end reporting in place
|
[14:44] pieterh
|
then we can extend and improve it...
|
[14:45] pieterh
|
not sure if bindings should read sys://log themselves or leave this to apps...
|
[14:46] pieterh
|
in the latter case, people just won't use it and will still have the same 'what is happening?' issue
|
[14:46] sustrik
|
it's up to binding developers imo
|
[14:47] pieterh
|
some thought and consistent policy would help IMO
|
[14:49] sustrik
|
i would leave binding as thin as possible personally
|
[14:49] sustrik
|
=> no need to change anything
|
[14:55] pieterh
|
there is one very small issue with the syslog design
|
[14:55] pieterh
|
if you get an error right at startup, you won't see it
|
[14:56] pieterh
|
due to the async connect issue with pubsub
|
[14:57] sustrik
|
right
|
[14:57] pieterh
|
since I'm conflating errors (sending only the first of a series), this means sometimes nothing shows
|
[14:57] pieterh
|
I
|
[14:57] pieterh
|
I've no solution in mind but it's an interesting use case
|
[14:58] sustrik
|
are you subscribing for logs immediately after zmq_init()
|
[14:58] sustrik
|
?
|
[14:58] pieterh
|
yes
|
[14:59] sustrik
|
that should not happen imo
|
[14:59] sustrik
|
likely a bug...
|
[14:59] pieterh
|
hang on, there should be no delay over inproc...
|
[14:59] pieterh
|
i'm mistaken, it's because I'm using two threads
|
[14:59] pieterh
|
ok, easily solved, connect subscriber in main thread and then pass to background thread
|
[14:59] sustrik
|
ack
|
[15:00] pieterh
|
ack, that fixes it
|
[15:01] yrashk
|
when I am doing getsockopts on ZMQ_IDENTITY, should I allocate option value?
|
[15:02] yrashk
|
sorry for dumb questions, was working through the whole night
|
[15:02] sustrik
|
yrashk: you supply a buffer
|
[15:02] sustrik
|
0mq will fill the identity into the buffer
|
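Sustrik's answer in code: for `ZMQ_IDENTITY` the caller supplies the buffer and an in/out length. A sketch against the 0MQ 2.x C API; `socket` is assumed to exist, and error handling is omitted.

```c
#include <zmq.h>

/* Reading ZMQ_IDENTITY, per the advice above: the caller supplies a
   buffer and a length; 0MQ fills in the identity and updates the
   length to the number of bytes written. Sketch only. */
static size_t get_identity(void *socket, char *buf, size_t bufsize)
{
    size_t len = bufsize;                  /* in: size of the buffer */
    zmq_getsockopt(socket, ZMQ_IDENTITY, buf, &len);
    return len;                            /* out: identity length */
}
```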
[15:03] yrashk
|
gotcha, will do a 255-byte buf
|
[15:05] yrashk
|
I am not sure I really understand this IDENTITY thing yet, though
|
[15:08] yrashk
|
I am getting garbage from there :S
|
[15:08] yrashk
|
even though I set it to a binary value beforehand
|
[15:08] yrashk
|
lol, debugging stuff at 7am is fun
|
[15:12] pieterh
|
yrashk, then tomorrow you're like 'wha...?'
|
[15:12] yrashk
|
well
|
[15:12] yrashk
|
that's a given
|
[15:12] yrashk
|
but anyway I fixed something
|
[15:12] yrashk
|
and it works
|
[15:13] yrashk
|
so what's left?
|
[15:13] yrashk
|
I guess non-blocking receive
|
[15:13] yrashk
|
and nicer API
|
[15:28] pieterh
|
sustrik, ok, patch sent for the syslog interface
|
[15:28] pieterh
|
it's pretty brutal, will benefit from some C++ review
|
[15:34] yrashk
|
hooray
|
[15:34] yrashk
|
even pub/sub already works
|
[15:35] sustrik
|
:)
|
[15:35] yrashk
|
https://gist.github.com/36cde07be28d3dad39a7
|
[15:35] yrashk
|
and I added nicer higher level API
|
[15:37] yrashk
|
I am actually porting those right now
|
[15:38] sustrik
|
!
|
[15:39] yrashk
|
it probably is much worse
|
[15:39] yrashk
|
since it's currently blocking
|
[15:39] yrashk
|
haha
|
[15:39] yrashk
|
the same
|
[15:39] yrashk
|
mean throughput: 30938.745584228862 [msg/s]
|
[15:39] yrashk
|
still my API is nice :-P
|
[15:41] yrashk
|
I increased message size to 100 bytes
|
[15:41] yrashk
|
and now it's faster:
|
[15:41] yrashk
|
mean throughput: 40136.63795933773 [msg/s]
|
[15:42] sustrik
|
what's going on there?
|
[15:42] yrashk
|
1000 bytes gives 38K
|
[15:43] yrashk
|
so I wrote this nice NIF API for nothing? :-(
|
[15:43] sustrik
|
is it possible that erlang itself is so slow as not to be able to execute the loop faster than that?
|
[15:43] yrashk
|
actually 40K for 1000 bytes
|
[15:43] sustrik
|
still, it should be 400k
|
[15:43] sustrik
|
at least
|
[15:44] yrashk
|
true
|
[15:44] yrashk
|
I am thinking why this is happening
|
[15:44] yrashk
|
I cut the driver communication out
|
[15:44] yrashk
|
so now it's NIF
|
[15:44] sustrik
|
can you benchmark average duration spent in the NIF
|
[15:44] sustrik
|
?
|
[15:45] sustrik
|
say in send?
|
[15:45] yrashk
|
or may be even better in recv?
|
[15:45] yrashk
|
send should be faster anyway, no?
|
[15:45] yrashk
|
getting 44K on 1 bytes
|
[15:45] sustrik
|
recv is more problematic, because it can be slow either because it itself is slow or because sender is slow
|
[15:46] yrashk
|
added noblock to sender
|
[15:46] sustrik
|
so, if you are able to find out how much time is spent in send
|
[15:46] sustrik
|
it may turn out that it's 10%
|
[15:46] sustrik
|
that would mean that erlang itself is slow
|
[15:46] yrashk
|
well I can do that
|
[15:47] yrashk
|
still I will probably not delete my binding
|
[15:47] sustrik
|
or 95% which would mean erlzmq is slow
|
[15:47] yrashk
|
as it's quite nice actually
|
[15:47] sustrik
|
no, don't do that
|
[15:47] sustrik
|
it may come handy one day
|
[15:48] cremes
|
erlang isn't known for execution performance; it's known for massive concurrency
|
[15:49] cremes
|
if the problem isn't parallelizable, perf isn't usually too impressive
|
[15:49] yrashk
|
it spends 2-3 microseconds in that call
|
[15:49] yrashk
|
(recv)
|
[15:50] sustrik
|
that should mean 330,000-500,000 msgs/sec
|
[15:51] yrashk
|
may be my math is wrong?
|
[15:51] yrashk
|
;)
|
[15:51] sustrik
|
yes, the math there is tricky
|
[15:51] yrashk
|
but it does spend up to 20-30 seconds or so
|
[15:51] sustrik
|
due to rounding errors
|
[15:51] yrashk
|
for a million messages
|
[15:51] sustrik
|
then the result looks right
|
[15:52] yrashk
|
real 0m31.754s
|
[15:52] yrashk
|
user 0m14.646s
|
[15:52] yrashk
|
sys 0m18.121s
|
[15:52] sustrik
|
the problem is that most of the time is spent *out* of erlzmq
|
[15:52] yrashk
|
well it is ezmq now
|
[15:52] yrashk
|
I rewrote it from scratch
|
[15:52] yrashk
|
;)
|
[15:52] sustrik
|
:)
|
[15:52] sustrik
|
those 2-3 microseconds are spent in NIF
|
[15:52] sustrik
|
or 0MQ itself?
|
[15:53] yrashk
|
NIF call
|
[15:53] sustrik
|
can you check the same on the send() call?
|
[15:53] yrashk
|
sure
|
[15:54] yrashk
|
10-15 microseconds
|
[15:54] yrashk
|
well even 8-15
|
[15:55] sustrik
|
with 40k msgs/sec
|
[15:55] sustrik
|
the time for a single message is 25 usec
|
[15:55] yrashk
|
a little less without noblock
|
[15:55] sustrik
|
8-15 can be attributed to ezmq
|
[15:55] sustrik
|
remaining 10 are either erlang itself or the time measurement
|
[15:56] yrashk
|
still 8-15 is pretty fast, right?
|
[15:56] sustrik
|
it equals to 66k-125k msgs/sec
|
[15:57] yrashk
|
the almost empty cycle takes about 890645 microseconds
|
[15:58] sustrik
|
1M iterations?
|
[15:58] yrashk
|
yep
|
[15:58] yrashk
|
well, it's not empty
|
[15:58] yrashk
|
but almost
|
[15:58] sustrik
|
0.8 usec per iteration
|
[15:58] sustrik
|
that's reasonable
|
[15:58] sustrik
|
so what's the rest of the time spent on?
|
[15:58] yrashk
|
a little bit slower with integer value in it
|
[15:58] yrashk
|
I have no idea yet
|
[15:59] sustrik
|
the cycle is 25 usec long
|
[15:59] sustrik
|
8-15usec in NIF
|
[15:59] sustrik
|
0.8 the iteration itself
|
[15:59] sustrik
|
still some 10 usecs are missing
|
[15:59] yrashk
|
yup
|
[15:59] yrashk
|
you mean on send, right?
|
[15:59] sustrik
|
yes
|
[16:00] yrashk
|
right
|
[16:00] yrashk
|
I have no idea :)
|
[16:00] sustrik
|
maybe send is fast
|
[16:00] sustrik
|
how long does sending 1000000 msgs take?
|
[16:00] sustrik
|
measured before calling zmq_term()
|
[16:01] sustrik
|
(zmq_term waits for messages to be actually pushed to the wire)
|
[16:01] yrashk
|
lets see
|
[16:01] sustrik
|
(what we are interested in is how fast is erlang/ezmq able to push messages to 0mq)
|
[16:01] yrashk
|
I am measure the whole script timing
|
[16:02] yrashk
|
measuring*
|
[16:02] yrashk
|
25 secds
|
[16:02] yrashk
|
secs
|
[16:02] yrashk
|
real 0m25.188s
|
[16:02] yrashk
|
user 0m26.410s
|
[16:02] yrashk
|
sys 0m22.533s
|
[16:03] sustrik
|
that's before zmq_term is called, right?
|
[16:04] sustrik
|
that's 40k msgs/sec i.e. 25 usec per message
|
[16:04] yrashk
|
no, not before
|
[16:05] mikko
|
are you setting linger to 0
|
[16:05] mikko
|
?
|
[16:06] yrashk
|
def not explicitly
|
[16:06] sustrik
|
right, setting linger to 0 would cause measuring only pushing the messages to 0mq
|
[16:06] yrashk
|
just got 52K messages per sec
|
[16:06] yrashk
|
also, 14249556 microseconds to send
|
[16:06] yrashk
|
before term
|
[16:07] yrashk
|
that is 14 seconds
|
[16:07] yrashk
|
now even 13
|
[16:07] sustrik
|
ie. 14 usec per message
|
[16:07] yrashk
|
and 67K messages
|
[16:07] yrashk
|
and I am not changing anything
|
[16:07] sustrik
|
with 8-15 in NIF
|
[16:07] sustrik
|
that sounds more or less reasonable
|
[16:08] sustrik
|
assuming that 8 is an outlier
|
[16:08] yrashk
|
well I can't measure the NIF timing exactly, but it's very close to that
|
[16:08] sustrik
|
and average values are around 14-15
|
[16:08] yrashk
|
avg seemed to be around 11-13
|
[16:08] yrashk
|
anyway I am getting even 67K messages
|
[16:08] sustrik
|
that seems ok
|
[16:08] yrashk
|
I was not able to see that with erlzmq
|
[16:09] sustrik
|
14.9 usec/msg
|
[16:09] sustrik
|
can you do ezmq/c perf test as before?
|
[16:10] sustrik
|
ezmq local + c remote
|
[16:10] sustrik
|
and vice versa?
|
[16:10] yrashk
|
sure just let me finish 10mln messages test
|
[16:10] sustrik
|
sure
|
[16:10] yrashk
|
still running...
|
[16:11] yrashk
|
that's likely to be like 3 mins
|
[16:11] yrashk
|
:D
|
[16:11] yrashk
|
whoa
|
[16:11] yrashk
|
mean throughput: 71015.7477562681 [msg/s]
|
[16:11] yrashk
|
71K
|
[16:12] yrashk
|
with C remote:
|
[16:12] yrashk
|
mean throughput: 119761.54040167542 [msg/s]
|
[16:13] yrashk
|
twice as good as it was with erlzmq btw
|
[16:13] yrashk
|
:-P
|
[16:13] yrashk
|
mean throughput: 126762.65050054133 [msg/s]
|
[16:14] yrashk
|
it looks like it warms up more every time I run it again
|
[16:14] yrashk
|
=_
|
[16:15] sustrik
|
heh
|
[16:16] yrashk
|
one thing I know for sure
|
[16:16] yrashk
|
it's faster than erlzmq
|
[16:16] yrashk
|
I just retested C/erlzmq: it tops out at 80K msgs/sec
|
[16:16] yrashk
|
while C/ezmq tops out at about 120K msgs/sec
|
[16:17] sustrik
|
nice
|
[16:18] yrashk
|
and I only spent... like 2 hours on it?
|
[16:18] sustrik
|
so maybe just write an email about your experiments to the ML
|
[16:18] sustrik
|
and have some sleep
|
[16:18] yrashk
|
I am too lazy to write emails right now
|
[16:18] sustrik
|
sure
|
[16:18] yrashk
|
well hopefully ezmq will be of use for somebody
|
[16:18] yrashk
|
since it's faster
|
[16:19] sustrik
|
we can even have a look at why the call to NIF takes 15 usec later on
|
[16:19] yrashk
|
yeah it's quite long
|
[16:19] yrashk
|
C version takes no time at all
|
[16:19] sustrik
|
something like 0.3 usec
|
[16:20] sustrik
|
so there's quite a lot of overhead that can be possibly eliminated
|
[16:20] yrashk
|
for all messages
|
[16:20] sustrik
|
for 1-byte messages
|
[16:20] yrashk
|
right
|
[16:20] yrashk
|
also may be it is not batching for some reason in this scenario?
|
[16:20] yrashk
|
could that happen?
|
[16:21] sustrik
|
the only case where batching does not occur is when messages are passed to 0mq more slowly than the underlying network stack can send them
|
[16:21] sustrik
|
which can be the case here
|
[16:22] sustrik
|
however, if we speed up the erlzmq part
|
[16:22] sustrik
|
the batching will kick in
|
[16:22] yrashk
|
I think this is the case
|
[16:22] yrashk
|
we send whole packets for every byte
|
[16:23] sustrik
|
possible
|
[16:23] sustrik
|
but only way to make it not happen is to speed up the sender
|
[16:23] sustrik
|
i.e. erlang+ezmq
|
[16:24] yrashk
|
yup
|
[16:25] yrashk
|
yet I have no idea why this is so slow
|
[16:25] yrashk
|
well, erlang is indeed not very fast
|
[16:25] yrashk
|
but still
|
[16:27] yrashk
|
this code https://github.com/yrashk/ezmq/blob/master/c_src/ezmq_nif.c#L250 looks clean enough to me
|
[16:27] yrashk
|
to not cause *too* much trouble
|
[16:30] sustrik
|
yes, this is how most bindings look like
|
[16:30] sustrik
|
erlzmq is exceptional in its complexity
|
[16:30] yrashk
|
mine is much simpler, eh?
|
[16:30] sustrik
|
yup
|
[16:30] yrashk
|
and faster [grin]
|
[16:30] sustrik
|
hi5!
|
[16:44] yrashk
|
hm may be binary allocation isn't that fast
|
[16:44] yrashk
|
there is a way to tell erlang to use an existing memory region as a binary without allocating it
|
[16:45] yrashk
|
but it means it will get into erlang gc
|
[16:45] yrashk
|
and should not be altered until erlang collects it
|
[16:45] yrashk
|
sustrik: is there a way to do that with 0mq?
|
[16:46] yrashk
|
should I not close message on recv and only close in a finalizer called by erlang gc?
|
[16:47] sustrik
|
yrashk: i don't follow
|
[16:47] sustrik
|
inside the NIF you get a message
|
[16:47] yrashk
|
yes
|
[16:47] yrashk
|
then I copy and close it
|
[16:47] sustrik
|
sounds ok
|
[16:47] yrashk
|
instead there is a possibility that I can use another API (that I haven't used before)
|
[16:47] yrashk
|
that allows me to reuse existing memory region as a binary
|
[16:48] yrashk
|
instead of copying it and allocating binary
|
[16:48] yrashk
|
and I can attach a destructor to such a binary
|
[16:48] yrashk
|
which can call msg_close
|
[16:48] yrashk
|
I am just thinking how good or bad this approach is
|
[16:49] sustrik
|
it's not bad, but it'll get pretty complex soon
|
[16:49] sustrik
|
i would just stick with the existing copy semantics at the moment
|
[16:49] yrashk
|
I am just trying to understand where's the bottleneck
|
[16:50] yrashk
|
btw it looks like ezmq got more or less consistently 2x faster than erlzmq
|
[16:50] yrashk
|
with some minor changes
|
[16:50] yrashk
|
at least I am constantly getting 60-70K
|
[16:50] sustrik
|
nice
|
[16:51] sustrik
|
handling native memory from the VM tends to be pretty messy
|
[16:51] yrashk
|
I know
|
[16:51] sustrik
|
anyway, 15 usec for an allocation seems too much
|
[16:51] yrashk
|
well
|
[16:51] yrashk
|
I don't know what exactly takes 15 usec
|
[16:51] sustrik
|
i would rather guess scheduling is kicking in or something like that
|
[16:52] yrashk
|
I am just reading the code and fantasizing
|
[16:52] yrashk
|
well when it enters C part
|
[16:52] yrashk
|
there is no scheduling happening whatsoever
|
[16:52] sustrik
|
doesn't the erlang thread get into wait state
|
[16:53] sustrik
|
while dedicated NIF thread does the work?
|
[16:53] yrashk
|
well current scheduler waits until C function ends
|
[16:53] yrashk
|
there's no dedicated NIF thread
|
[16:53] yrashk
|
it's in the scheduler's thread
|
[16:53] sustrik
|
ok
|
[16:53] sustrik
|
that should not take 15 usec either
|
[16:54] yrashk
|
no
|
[16:54] yrashk
|
I am really at loss here
|
[16:54] sustrik
|
maybe just locking and unlocking a mutex or somesuch
|
[16:54] yrashk
|
I don't quite get it
|
[16:54] sustrik
|
~1 usec
|
[16:54] yrashk
|
NIFs are pretty much like BIFs, implemented directly in the VM's opcode interpreter
|
[16:55] sustrik
|
can we do measurements in both erlang and ezmq?
|
[16:55] sustrik
|
1. before NIF is called
|
[16:55] sustrik
|
2. when C code is invoked
|
[16:56] sustrik
|
3. when C code is ready
|
[16:56] yrashk
|
btw time measurement takes time too
|
[16:56] sustrik
|
4. when erlang resumes execution
|
[16:56] sustrik
|
yes
|
[16:56] sustrik
|
something like reading the TSC is preferable
|
[16:56] sustrik
|
however, i am not sure it's available from erlang
|
[16:56] yrashk
|
TSC?
|
[16:57] sustrik
|
processor tick count
|
[16:58] sustrik
|
but we don't need to do that now
|
[16:58] sustrik
|
let's use just standard time measurement
|
[16:58] sustrik
|
even if single measurement takes 1 usec
|
[16:59] sustrik
|
the original 15 usecs should still be visible
|
[17:01] yrashk
|
ya
|
[17:02] yrashk
|
I usually prefer to think about the potential cause of a problem, sometimes logic beats brute force
|
[17:02] yrashk
|
:D
|
[17:04] sustrik
|
my experience with tuning is that in most cases you are very surprised when you find out where the bottleneck is
|
[17:05] yrashk
|
true
|
[17:07] yrashk
|
damn my macbook air is much slower
|
[17:08] yrashk
|
...than my dual quad core mac pro :D
|
[17:09] guido_g
|
how come?
|
[17:09] yrashk
|
very weird
|
[17:09] guido_g
|
in deed
|
[17:09] yrashk
|
in bed
|
[17:09] guido_g
|
sustrik: any pgm changes between 2.1.0 und the current git?
|
[17:10] guido_g
|
updated zmq to git master and pgm stopped working
|
[17:10] guido_g
|
pgm receive
|
[17:10] sustrik
|
what's the problem?
|
[17:10] guido_g
|
epgm messages are not received anymore
|
[17:11] guido_g
|
restored the 2.1.0 from the tar and everything was back
|
[17:12] guido_g
|
plain amd64 linux (2.6.37) box
|
[17:15] sustrik
|
can you report the problem?
|
[17:15] guido_g
|
sure
|
[17:15] sustrik
|
thanks
|
[17:15] guido_g
|
going to write small tests
|
[17:16] sustrik
|
post a note to the ML so that steve-o is aware of the problem
|
[17:16] guido_g
|
maybe it's the python binding...
|
[17:16] guido_g
|
yepp
|
[17:16] sustrik
|
not likely imo
|
[17:16] guido_g
|
no, because i just switched libzmq
|
[17:17] yrashk
|
anyway I am off for a nap
|
[17:17] yrashk
|
sustrik: thanks a lot for your help!
|
[17:17] guido_g
|
good night!
|
[17:17] yrashk
|
more like morning here already
|
[17:17] yrashk
|
:D
|
[17:17] yrashk
|
thanks anyway
|
[17:32] sustrik
|
good night
|
[17:40] eyeris
|
Am I doing something incorrectly here, or is that "Resource temporarily unavailable" message related to some actual resource being in use? I've checked that port 5563 is not in use. What other resource could it be referring to? http://pastebin.com/SdxBiFAk
|
[17:40] guido_g
|
ouch
|
[17:41] guido_g
|
the test sender runs into an assert
|
[17:41] guido_g
|
Assertion failed: rc == 0 (connect_session.cpp:82)
|
[17:43] eyeris
|
guido_g: in my code? I don't see that
|
[17:43] guido_g
|
no
|
[17:51] guido_g
|
mail sent
|
[17:58] guido_g
|
eyeris: it's a timing problem
|
[17:58] guido_g
|
the io thread simply does not have enough time to connect
|
[17:59] guido_g
|
eyeris: https://gist.github.com/832259 <- works
|
[18:02] eyeris
|
I understand the idea behind what you're saying. In fact, I tried the sleep(1) before the loop earlier and it didn't work. Using your gist I get an error: Assertion failed: *tmpbuf > 0 (src/zmq_decoder.cpp:60)
|
[18:03] guido_g
|
ouch
|
[18:04] guido_g
|
i'm on zeromq 2.1.0 from the tarball and the latest pyzmq (git master)
|
[18:05] eyeris
|
I am on 2.0.10 from pypi
|
[18:05] guido_g
|
and zeromq version?
|
[18:05] eyeris
|
Same
|
[18:06] guido_g
|
explains the assertion
|
[18:06] guido_g
|
try to update zeromq to 2.1.0 (from the tarball)
|
[18:15] eyeris
|
Alright, I've built and installed 2.1.0 into /opt/zeromq
|
[18:15] eyeris
|
I can't figure out how to make pyzmq find it though
|
[18:16] eyeris
|
There doesn't seem to be a --with-zeromq (or similar) build switch
|
[18:16] guido_g
|
from where did you get zeromq before?
|
[18:16] eyeris
|
pypi
|
[18:17] guido_g
|
not pyzmq
|
[18:17] eyeris
|
http://pypi.python.org/pypi/pyzmq-static/2.0.10
|
[18:18] guido_g
|
ouch
|
[18:18] eyeris
|
That includes the lib
|
[18:19] guido_g
|
get rid of it and use the non-static version
|
[18:20] guido_g
|
install zeromq using the default settings (using /usr/local/lib)
|
[18:20] guido_g
|
then install the non-static pyzmq
|
[18:34] eyeris
|
Works now
|
[18:34] eyeris
|
Thanks
|
[18:55] fbarriga
|
hello everyone
|
[18:57] fbarriga
|
is there any trade-off in using zmq when latency is not critical? (using it only because it's convenient)
|
[19:03] pieterh
|
hi fbarriga
|
[19:03] fbarriga
|
hi
|
[19:04] pieterh
|
0MQ makes great soup no matter how you slice it
|
[19:04] pieterh
|
its speed is just a bonus when you need it
|
[19:04] pieterh
|
what's your use case?
|
[19:06] fbarriga
|
because here I have a program (A) that receives data and sends it to another (B) to process it. I want to monitor B in a gui (C) and I don't need a fancy way to send the data. But as I'm already using zmq to communicate between A and B, it could be a good idea to use it to communicate with the gui too
|
[19:06] pieterh
|
sure
|
[19:09] fbarriga
|
has anyone tested the latency of a point-to-point channel vs. publisher/subscriber?
|
[19:10] pieterh
|
it's the same, unless you ping-pong back
|
[19:14] fbarriga
|
but is there any difference? not even 1 us?
|
[19:15] pieterh
|
over the same transport? depends how many messages you are sending, how large they are
|
[19:15] pieterh
|
it does not depend on the socket type
|
[19:16] fbarriga
|
using unix socket, thousands of msg with a size of 32 bytes
|
[19:16] pieterh
|
fbarriga: have you read the documentation yet? the Guide?
|
[19:17] fbarriga
|
umm a kind RTFM ?
|
[19:17] pieterh
|
no, I want to know how much you know about 0MQ
|
[19:17] fbarriga
|
I read it but not completely
|
[19:18] pieterh
|
'point to point channel' is not a 0MQ term
|
[19:18] fbarriga
|
Not too much; I have 2 publishers, one in C++ and the other in Python, feeding a C++ program
|
[19:19] pieterh
|
i think your questions are beside the point
|
[19:19] pieterh
|
(latency)
|
[19:19] pieterh
|
if you are not actually worried about usecs, it's irrelevant what I say
|
[19:20] staylor
|
something I'm not very clear on, if I have a server bound to an XREQ with multiple clients bound to REP how exactly could I send a message to a specific client from the server side? I'm not very clear on how the addressing works here.
|
[19:20] pieterh
|
the usual rule is measure it, don't trust others' opinions
|
[19:20] pieterh
|
staylor: read Ch3 of the Guide
|
[19:20] pieterh
|
you need an XREP socket to do routing, and we're going to rename that a ROUTER socket
|
[19:21] fbarriga
|
I'm worried about µs. The Python publisher uses bulk historic data (no latency problem). The C++ publisher receives market data in realtime
|
[19:21] pieterh
|
fbarriga: ok, that's good
|
[19:21] pieterh
|
you can try this yourself, it's trivial, two test programs in C++
|
[19:21] pieterh
|
try pubsub and push-pull
|
[19:21] pieterh
|
measure the latency
|
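pieterh's "measure it yourself" advice is easy to follow. Below is a hypothetical sketch (in Python with pyzmq rather than the C++ pieterh suggests, and over `inproc://` rather than fbarriga's unix sockets) of a round-trip latency test for a PUSH/PULL pair with 32-byte messages; swapping the socket types for PUB/SUB gives the comparison point. The endpoint names and message count are made up for the demo.

```python
# Sketch: round-trip latency over PUSH/PULL, pyzmq, inproc transport.
# Swap in ipc:// or tcp:// endpoints (and PUB/SUB sockets) to compare.
import threading
import time

import zmq

def measure_latency(n_messages=1000, payload=b"x" * 32):
    ctx = zmq.Context.instance()

    def echo():
        # Echo service: pull each message and push it straight back.
        pull = ctx.socket(zmq.PULL)
        pull.bind("inproc://ping")
        push = ctx.socket(zmq.PUSH)
        push.bind("inproc://pong")
        for _ in range(n_messages):
            push.send(pull.recv())
        pull.close()
        push.close()

    t = threading.Thread(target=echo)
    t.start()
    time.sleep(0.1)  # let the echo thread bind before we connect

    push = ctx.socket(zmq.PUSH)
    push.connect("inproc://ping")
    pull = ctx.socket(zmq.PULL)
    pull.connect("inproc://pong")

    start = time.perf_counter()
    for _ in range(n_messages):
        push.send(payload)
        pull.recv()
    elapsed = time.perf_counter() - start

    t.join()
    push.close()
    pull.close()
    # One-way latency is half the measured round trip.
    return elapsed / n_messages / 2

if __name__ == "__main__":
    print("avg one-way latency: %.1f us" % (measure_latency() * 1e6))
```

As pieterh notes below, this sidesteps fbarriga's worry that reading the clock adds delay: timing the whole batch and dividing amortizes the clock calls, which is what the bundled latency test programs do.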
[19:22] fbarriga
|
yes, I was working on that, but trying to avoid the work by asking here =)
|
[19:22] staylor
|
I'll look at it again, but out of quick curiosity would messages go to all REP sockets and be filtered or only to the specific client?
|
[19:22] pieterh
|
fbarriga: go do your own homework :-)
|
[19:22] pieterh
|
come back when you have results for us
|
[19:22] fbarriga
|
but it's not that easy, even getting the time adds some delay
|
[19:23] fbarriga
|
ok, thanks for the help
|
[19:23] pieterh
|
fbarriga: see how the latency test programs do it
|
[19:24] pieterh
|
staylor: clients are usually REQ, not REP
|
[19:24] pieterh
|
a REP socket is really for workers
|
[19:25] pieterh
|
are the names confusing?
|
[19:25] staylor
|
I might be going at this all wrong, but what I'm looking for is clients that connect to a server and wait for work requests (clients perform a task for the server)
|
[19:25] pieterh
|
call them 'workers', for sanity's sake :-)
|
[19:26] staylor
|
alright, so workers connect to the server which then sends them requests
|
[19:26] pieterh
|
it round-robins them
|
[19:26] pieterh
|
that's what an XREQ/DEALER socket does
|
[19:26] pieterh
|
deals the cards out
|
[19:26] staylor
|
in my scenario I need to send work to specific workers, that's where I'm unclear about how to identify them
|
[19:27] fbarriga
|
(any chance to use inproc for two different processes? like shared memory between processes?)
|
[19:27] pieterh
|
staylor: so, read Ch3, it is explained in exhaustive detail
|
[19:27] pieterh
|
you need REQ sockets for the workers, and XREQ to route to specific workers
|
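The routing pieterh describes works because an XREP socket (ROUTER, in the renamed terminology he mentions above) prefixes every inbound message with the sender's identity, and routes an outbound message to whichever peer matches its first frame. A minimal sketch, using pyzmq with the modern ROUTER/REQ names and made-up identities and endpoint:

```python
# Sketch of identity-based routing: a ROUTER (XREP in 2.x terms) server
# addressing a specific REQ worker by its identity frame.
import threading

import zmq

ctx = zmq.Context.instance()

router = ctx.socket(zmq.ROUTER)
router.bind("inproc://work")

results = {}

def worker(name):
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.IDENTITY, name)  # fixed identity so the server can target us
    sock.connect("inproc://work")
    sock.send(b"ready")                  # announce ourselves
    results[name] = sock.recv()          # block until a task addressed to us arrives
    sock.close()

threads = [threading.Thread(target=worker, args=(n,)) for n in (b"A", b"B")]
for t in threads:
    t.start()

# Collect both "ready" messages; ROUTER prepends each sender's identity,
# and the REQ socket adds an empty delimiter frame.
for _ in range(2):
    identity, empty, msg = router.recv_multipart()

# Address each worker individually by leading with its identity frame.
router.send_multipart([b"B", b"", b"task-for-B"])
router.send_multipart([b"A", b"", b"task-for-A"])

for t in threads:
    t.join()
router.close()
# Each worker received only the task addressed to it.
```

This is the mechanism Ch3 of the Guide spells out: messages do not go to all workers and get filtered (staylor's question above); the ROUTER delivers each one only to the peer named in its identity frame.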
[19:28] pieterh
|
fbarriga: nope, not unless you're able to hack together a shmem transport for 0MQ
|
[19:28] staylor
|
I think I see where I misunderstood the docs, I'll go over chap3 again
|
[19:28] staylor
|
thank you
|
[19:28] pieterh
|
np
|
[19:29] fbarriga
|
has anyone tried?
|
[19:29] pieterh
|
don't think so
|
[19:29] pieterh
|
it would be fun to do but not portable
|
[20:16] mikko
|
evenin'
|
[20:26] staylor
|
is there a way to know if a worker is connected or not, or should that be left to the application to send acknowledgement requests?
|
[20:26] staylor
|
for example knowing when the underlying socket is broken?
|
[20:28] Guthur
|
staylor, the application will have to take care of the reliability
|
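Since 0MQ itself won't report a broken or absent peer, the usual application-level starting point is a receive with a poll timeout: silence past the deadline is treated as a dead worker, and heartbeat messages refine the scheme. A hypothetical sketch with pyzmq (the endpoint name and 200 ms timeout are invented for the demo):

```python
# Sketch: detect an absent/dead peer by polling with a timeout instead
# of blocking forever in recv(). Nothing connects in this demo, so the
# poll times out and we treat the peer as gone.
import zmq

ctx = zmq.Context.instance()
sock = ctx.socket(zmq.PULL)
sock.bind("inproc://liveness-demo")

poller = zmq.Poller()
poller.register(sock, zmq.POLLIN)

peer_alive = True
events = dict(poller.poll(timeout=200))  # milliseconds
if sock in events:
    msg = sock.recv()  # traffic arrived within the deadline
else:
    # No traffic before the timeout: assume the peer is down and let
    # application-level recovery (resend, reassign, reconnect) kick in.
    peer_alive = False

sock.close()
```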
[20:29] staylor
|
alright, thank you
|
[20:57] drbobbeaty
|
sam: I'm not one of the main guys, but I'm using epgm and if I can answer your question, I'll give it a try. Can you repeat it (or cut-n-paste it) again?
|
[20:57] Guthur
|
sam`, you might try the Mailing list as well
|
[21:01] drbobbeaty
|
Wow... 100MB... Honestly, 0MQ is not what I'd be using - not that it's bad, but there are system-level tools to do this far more efficiently. For example, rdist on unix... you set up a "tree" of your datacenters where the 'source' sends to 4 boxes, each of those 4 send to 4 more, etc. Pretty soon it's all done.
|
[21:01] drbobbeaty
|
As for the source-specific multicast, that's not really possible in the 0MQ world because the sender is masked by the OpenPGM/URL scheme.
|
[21:01] lt_schmidt_jr
|
Upon advice from the experts I have grabbed the 2.1.0 tarball instead of the 2.0.10 I have been using with homebrew on OSX
|
[21:01] lt_schmidt_jr
|
I am seeing the following error:
|
[21:02] lt_schmidt_jr
|
Making check in tests
|
[21:02] lt_schmidt_jr
|
make check-TESTS
|
[21:02] lt_schmidt_jr
|
PASS: test_pair_inproc
|
[21:02] lt_schmidt_jr
|
PASS: test_pair_ipc
|
[21:02] lt_schmidt_jr
|
PASS: test_pair_tcp
|
[21:02] lt_schmidt_jr
|
PASS: test_reqrep_inproc
|
[21:02] lt_schmidt_jr
|
PASS: test_reqrep_ipc
|
[21:02] lt_schmidt_jr
|
PASS: test_reqrep_tcp
|
[21:02] lt_schmidt_jr
|
Invalid argument
|
[21:02] lt_schmidt_jr
|
nbytes != -1 (tcp_socket.cpp:197)
|
[21:02] lt_schmidt_jr
|
FAIL: test_shutdown_stress
|
[21:02] lt_schmidt_jr
|
sorry for the paste fail
|
[21:02] lt_schmidt_jr
|
so test_shutdown_stress fails when I run make check
|
[21:02] lt_schmidt_jr
|
should I worry?
|
[21:03] drbobbeaty
|
sam`: If you want to double-check me on the source specific multicast, send something to the mailing list. As for the 100MB file, I think there are better ways - especially since multicast is not routable so you'd have to make "tunnels" and that's a lot of work. There are tools/utilities geared for this for unix.
|
[21:04] drbobbeaty
|
lt_schmidt_jr: this is a known issue with 2.1.0 but I *believe* it's fixed in the master (HEAD) on the git repo.
|
[21:05] lt_schmidt_jr
|
drbobbeaty: thanks I think I will hold off, given the issues with the HEAD revision I see on the list
|
[21:06] drbobbeaty
|
sam`: I can almost certainly guarantee that it's not going to route through the internet. If you have a private WAN, then yes, it can be set up to have one switch send the data to another, but if you're betting on the plain old net, I think you're going to find it isn't.
|
[21:08] drbobbeaty
|
sam`: OK.. if you have private WANs then you can route it. However, my point on the best tool for the job stands. You can build something with 0MQ and use epgm, but I think there are better tools like rdist already written that make this so much easier.
|
[21:09] drbobbeaty
|
sam`: OK... I'm missing something. If you have 100MB to move from A to B, then you can put that 100MB in a file, use rdist, load it at the other end. Right? If not, what am I missing about your requirements?
|
[21:12] drbobbeaty
|
sam`: If it were put in as a file, then the 1000 machines at B can all read it at the same time. Maybe what you need is an NFS mount for all machines at B to share?
|
[21:15] drbobbeaty
|
sam`: if you have a primary box with a 10GbE NIC (or two), then it'll be a load, but not a lot.
|
[21:16] drbobbeaty
|
sam`: you know your set-up best, and maybe this isn't a workable solution. If that's the case, then sure, use 0MQ.
|
[21:16] drbobbeaty
|
sam`: You can write a "sender" and a "receiver" PUB/SUB and then make sure it's routed, and off you go.
|
[21:16] drbobbeaty
|
The examples in The Guide are really about all you need.
|
[21:19] drbobbeaty
|
sam`: But it's not "free"... if you have a failure, and a NACK is sent, you'll have to send that packet again - and with a WAN, the possibility goes up for that to happen.
|
[21:19] drbobbeaty
|
sam`: you are going to need a networking expert for this to be really efficient and workable.
|
[21:29] drbobbeaty
|
sam`: As for the source-specific multicast, that's not really possible in the 0MQ world because the sender is masked by the OpenPGM/URL scheme.
|
[21:30] drbobbeaty
|
sam`: but ask the mailing list to double-check me on this. I just don't see how with the APIs I've been using.
|
[21:55] sustrik
|
lt_schmidt_jr: can you possibly find out what's the errno when the test fails?
|
[21:57] lt_schmidt_jr
|
let me see
|
[21:58] mikko
|
sustrik: steve-o suspects that openpgm with the changes needed for zeromq integration will be released next week around thu-fri
|
[21:59] lt_schmidt_jr
|
sustrik: this is what I am seeing: Invalid argument\n nbytes != -1 (tcp_socket.cpp:197)
|
[22:01] lt_schmidt_jr
|
sustrik: I see what you are asking for, let me try to instrument the source
|
[22:02] sustrik
|
ah, invalid argument
|
[22:02] sustrik
|
i've missed that
|
[22:02] sustrik
|
sorry
|
[22:02] sustrik
|
mikko: great
|
[22:04] sustrik
|
hm, POSIX doesn't allow for send() returning EINVAL
|
[22:05] sustrik
|
Linux docs are not very helpful either:
|
[22:05] sustrik
|
"EINVAL Invalid argument passed."
|
[22:06] sustrik
|
lt_schmidt_jr: is that linux?
|
[22:06] lt_schmidt_jr
|
sustrik: mac osx
|
[22:07] sustrik
|
can you check "man send"
|
[22:07] sustrik
|
and look for EINVAL?
|
[22:07] lt_schmidt_jr
|
checking
|
[22:08] lt_schmidt_jr
|
sustrik: [EINVAL] The sum of the iov_len values overflows an ssize_t.
|
[22:09] sustrik
|
funny
|
[22:10] sustrik
|
that applies to sendmsg
|
[22:10] lt_schmidt_jr
|
yes
|
[22:10] sustrik
|
in this case it's plain send that returns EINVAL
|
[22:10] lt_schmidt_jr
|
send() does not mention EINVAL in man
|
[22:11] lt_schmidt_jr
|
sustrik: where did you see EINVAL?
|
[22:13] sustrik
|
"Invalid argument"
|
[22:13] sustrik
|
= EINVAL
|
[22:14] matott
|
I'm starting with zmq. As an overlay network, zmq uses messages and doesn't provide a streaming API. Are there any fragmentation libraries to emulate streaming?
|
[22:16] lt_schmidt_jr
|
sustrik: this is odd, I have added a printf just before the failing assert to see the error number, and the failure changed to mailbox.cpp
|
[22:16] matott
|
I had a look at mongrel, but it seemed to me from the demos that the applications had to handle it itself.
|
[22:17] sustrik
|
what assert?
|
[22:18] lt_schmidt_jr
|
nbytes != -1 (tcp_socket.cpp:197)
|
[22:18] sustrik
|
in mailbox.cpp?
|
[22:20] lt_schmidt_jr
|
I may be very confused, but tcp_socket.cpp:197 contains: errno_assert (nbytes != -1);
|
[22:21] sustrik
|
"this is odd, I have added a printf just before the failing assert to see the error number, and the failure changed to mailbox.cpp"
|
[22:22] lt_schmidt_jr
|
but once I try to print the errno just before the errno_assert in tcp_socket.cpp, the failure becomes: Assertion failed: nbytes == sizeof (command_t) (mailbox.cpp:193)
|
[22:22] sustrik
|
aha
|
[22:22] sustrik
|
that's a known problem with OSX
|
[22:23] sustrik
|
OSX has small default socket buffers
|
[22:23] sustrik
|
and resizing them seems to be broken
|
[22:23] cremes
|
sustrik: right... i have my little local "patch" that solves that for me on osx
|
[22:23] cremes
|
maybe lt_schmidt_jr can use the same thing until it gets fixed in master
|
[22:23] sustrik
|
yup, it would be nice to solve it in systemic way
|
[22:24] sustrik
|
however, i have no OSX box...
|
[22:24] sustrik
|
as for the tcp_socket/EINVAL error that's plain strange
|
[22:24] sustrik
|
it looks like the OS is returning an undocumented error :(
|
[22:25] lt_schmidt_jr
|
hey, it's like being back on Windows
|
[22:25] lt_schmidt_jr
|
:)
|
[22:25] sustrik
|
shrug
|
[22:26] lt_schmidt_jr
|
well, it's ok; if I can get a patch, that would be great - we would be deploying on CentOS
|
[22:26] lt_schmidt_jr
|
if I can get by for development
|
[22:28] lt_schmidt_jr
|
cremes: what version does your patch ... patch?
|
[22:29] cremes
|
lt_schmidt_jr: it's against master
|
[22:29] cremes
|
http://article.gmane.org/gmane.network.zeromq.devel/7146/match=old%5fsndbuf
|
[22:30] lt_schmidt_jr
|
cremes: thanks
|
[22:31] lt_schmidt_jr
|
Should I be using master? I just went from 2.0.10 which homebrew installed to 2.1.0 tarball
|
[22:31] lt_schmidt_jr
|
based on your suggestion
|
[22:31] cremes
|
lt_schmidt_jr: you should be; however, that little patch i just gave you should also work on 2.0.10 if you are happy with that release
|
[22:33] lt_schmidt_jr
|
Thanks. I will stay with the 2.1.0 tarball - I have already fixed the java tests - they were not exiting since sockets were not closed before context.term()
|
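The Java test fix lt_schmidt_jr describes (sockets not closed before `context.term()`) reflects a rule that holds in every binding: `term()` blocks until all sockets on the context are closed, so a process that skips the closes appears to hang on exit. A sketch of the clean shutdown order in Python (illustrative only; the actual fix was in the Java tests):

```python
# Sketch: the same shutdown rule as the Java test fix -- close every
# socket before ctx.term(), or term() blocks waiting for them.
import zmq

ctx = zmq.Context()
pub = ctx.socket(zmq.PUB)
pub.bind("inproc://shutdown-demo")

sub = ctx.socket(zmq.SUB)
sub.setsockopt(zmq.SUBSCRIBE, b"")
sub.connect("inproc://shutdown-demo")

# ... exchange messages ...

# LINGER bounds how long close() waits for unsent messages (0 = drop them).
pub.setsockopt(zmq.LINGER, 0)
sub.setsockopt(zmq.LINGER, 0)
pub.close()
sub.close()
ctx.term()  # returns promptly only because every socket is closed
```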
[22:33] lt_schmidt_jr
|
My question is more 2.1.0 tar on the zmq page vs latest git
|
[22:34] lt_schmidt_jr
|
a little nervous about HEAD revision
|
[22:37] cremes
|
lt_schmidt_jr: HEAD is likely more stable than 2.1.0 tarball; *lots* of fixes including another mailbox assert on osx
|
[22:38] cremes
|
given a choice between the tarball and master, use master
|
[22:38] lt_schmidt_jr
|
cremes: thank you
|
[22:38] cremes
|
you are welcome
|
[22:38] cremes
|
if you have fixes for the java tests, please contribute them back
|
[22:46] lt_schmidt_jr
|
I have not been getting responses from gonzalo on the mvn issues, so not sure how to contribute (it's just 4 lines of socket.close())
|
[22:46] lt_schmidt_jr
|
to be fair to gonzalo, first he was gone, then he responded, I was gone, now he is gone ...
|