Thursday September 29, 2011

[Time] Name Message
[01:16] buddy1 'lo there
[04:36] creatstar hi all, i got a question here, i'm trying to install the ruby binding on mac os, with the option --with-zmq-dir=/usr/local, but it always says it couldn't find the zmq library
[04:37] creatstar but the libzmq.a is right at the directory /usr/local
[06:01] creatstar seems that it's an issue only on Mac OS
[06:37] sustrik creatstar: please file a bug report with the ruby binding project
[07:06] CIA-79 libzmq: 03Jon Dyte 07master * r34b114d 10/ src/router.cpp : Make sure new ROUTER socket honours POLLIN for cmd messages ...
[07:09] alfred1 morning all (from the EET timezone)
[07:23] sustrik morning
[08:13] NoRuLeS hi, does anyone know if there's any example code for a file-upload client?
[08:15] guido_g ømq is not like tcp or even http
[08:16] guido_g it's used for transmitting distinct messages, not a byte stream
[08:17] NoRuLeS that's what i thought after reading the zeromq guide
[08:18] NoRuLeS so there's no way to send files using zeromq right?
[08:22] eintr files?
[08:23] eintr that's strange on so many levels
[08:23] alfred1 NoRuLeS: I'm not a zeromq expert, but I think you can very well send files
[08:23] alfred1 if those are text, then this is clear..
[08:24] alfred1 if binary then I would base64 encode them
[08:24] guido_g sure, you could even play music and stream videos
[08:24] alfred1 (leaving the rest to the experts)
[08:24] eintr NoRuLeS: a message is an array of byte arrays , put whatever you want in it.
[08:25] guido_g that's stupid, why do you want to encode binary data into text? not even http requires that
[08:25] eintr NoRuLeS: but for streaming, you need to control - or at least provide on the consuming side - ordering, which is out of scope for a messaging library.
[08:25] NoRuLeS seems like i have to read further
[08:26] NoRuLeS will get back to you guys soon =]
[08:26] eintr NoRuLeS: no big deal though, you can just add a sequence number and piece together the order on the receiving side (and handle failure, deduplication etc etc)
[08:26] eintr NoRuLeS: or just use a tcp stream ;)
[08:26] guido_g simulating tcp over ømq over tcp...
[08:26] eintr guido_g: no, using tcp.
[08:27] guido_g "<eintr> NoRuLeS: no big deal though, you can just add a sequence number and piece together..."
[08:27] eintr guido_g: tcp does more than sequencing though. much more.
[08:27] guido_g oh really?
[08:28] eintr guido_g: retry? checksum? connection state?
[08:29] guido_g so, why did you suggest the seq. number thing then?
[08:29] eintr guido_g: because [s]he asked.
[08:30] guido_g he asked for file transfer
[08:30] eintr guido_g: what i said was if you want *ordering* in disjoint zeromq messages you need to do it yourself... this is #zeromq, not #interweb
[08:31] eintr guido_g: first thing i said as response was also "(10:22:57 AM) eintr: files?
[08:31] eintr (10:23:09 AM) eintr: that's strange on so many levels" .. so what exactly is your point?
[08:33] guido_g so the point was?
[08:34] eintr guido_g: whatever. don't be an ass by picking on ppl for giving on-topic advice.
[08:37] guido_g advice that doesn't match the level of the receiver *and* is misleading
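For what it's worth, eintr's sequence-number suggestion above is only a few lines of code. A minimal pure-Python sketch (the framing and chunk size are hypothetical choices, nothing 0MQ prescribes): each chunk becomes a distinct message carrying a 4-byte sequence number, and the receiving side reorders and deduplicates.

```python
import struct

CHUNK_SIZE = 4096

def chunk(data: bytes, size: int = CHUNK_SIZE):
    """Yield (seqno-prefixed) messages for a blob of file data."""
    for seq, off in enumerate(range(0, len(data), size)):
        # 4-byte big-endian sequence number prefixed to the payload
        yield struct.pack(">I", seq) + data[off:off + size]

def reassemble(messages):
    """Rebuild the original bytes from messages arriving in any
    order, dropping duplicates."""
    seen = {}
    for msg in messages:
        seq = struct.unpack(">I", msg[:4])[0]
        seen.setdefault(seq, msg[4:])   # first copy wins (dedup)
    return b"".join(seen[s] for s in sorted(seen))
```

As eintr notes, this does not give you retries, checksums, or connection state — only ordering and deduplication on top of whole-message delivery.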
[08:37] mile are there any benchmarks of (e)pgm vs tcp?
[08:37] guido_g mile: doesn't make much sense because the use-cases are so different
[08:38] mile true..
[08:38] guido_g mile: pgm is used in pub/sub scenarios w/ lots of participants
[08:38] mile are there any bandwidth issues?
[08:39] guido_g what do you mean?
[08:39] mile I would need it for a handful of participants
[08:39] mile exchanging lots of data
[08:39] guido_g then stick w/ tcp
[08:39] eintr mile: for high throughput to few consumers you're good with tcp, no fuss
[08:40] mile I wouldn't take pgm for better performance, but rather for simplicity
[08:41] mile if there are no big performance penalties
[08:41] mile but tcp would also be possible
[08:42] guido_g if you can be absolutely sure that your network admins can and are willing to deal w/ multicast, then try it, but i predict you'll get results faster w/ tcp
[08:42] mile k, thanks :)
[08:43] mile I think I'll go with multicast for initiation, with less data
[08:43] mile and then let the components open dedicated connections
[08:44] guido_g why?
[08:44] mile it reduces the amount of a-priori knowledge of network topology
[08:44] mile I'm aiming for a simple swarm-like behavior
[08:44] guido_g ahhh so different use-case
[08:45] guido_g yes, using a bus-like structure for configuration makes sense
[08:45] guido_g still you need to keep an eye on the pitfalls of multicast
[08:45] mile such as?
[08:45] guido_g some routers do have problems
[08:46] guido_g most co-location providers do not allow it
[08:46] mile I hope for the best, we will have a couple of servers in a rack
[08:46] guido_g funny things like ec2 are completely ruled out
[08:46] mile so the multicast would be internal only
[08:47] mile and I thought of building a proxy if a need to go over LAN ever occurs
[08:48] guido_g i still would suggest tcp, you can switch later on when you understand the application better (i.e. the application exists and is running)
[08:58] aphadke hi guys, i'm a newbie to zeromq, trying to compile the jzmq binding with zeromq-3.0.1 and one of the tests fails for jzmq "test_queue" ... any ideas?
[09:35] CIA-79 libzmq: 03Martin Sustrik 07master * r7a10bbe 10/ src/mtrie.cpp : Bug in subscription matching fixed (issue 263) ...
[10:15] mikko sustrik: alive?
[10:15] sustrik mikko: hi
[10:15] sustrik here i am
[10:16] mikko cool
[10:17] mikko i'm seeing pretty weird linger behaviour
[10:17] mikko im running a script using 2.1 zeromq (github master)
[10:17] mikko and if i sleep at the end of the script all my messages get there
[10:17] mikko but as i am not setting linger i would expect them to get out anyway
[10:17] sustrik yes
[10:18] mikko or is there a possible race condition where things have been pushed to tcp buffers but not yet sent
[10:18] sustrik what's exactly the problem?
[10:18] mikko i send 1k messages
[10:18] sustrik if you don't sleep, the messages don't get over?
[10:18] mikko and ~950 go to other end if i dont sleep
[10:18] mikko if i sleep (1) all messages get there
[10:18] sustrik do you zmq_term() at the end of the script?
[10:19] mikko yes
[10:19] sustrik then it's a bug
[10:19] mikko let me break on that just in case
[10:19] mikko Breakpoint 1, zmq_term (ctx_=0x1018791a0) at zmq.cpp:287
[10:19] sustrik ok
[10:19] mikko b zmq_term is hit
[10:19] sustrik if you have a test to reproduce it, i can give it a look
[10:25] mikko no simple test case
[10:25] mikko yet
[10:25] mikko noticed it last night when debugging pzq
[10:29] sustrik ack
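For reference, the semantics mikko is relying on: with the default LINGER (-1), zmq_term blocks until pending outbound messages have been handed off, so no sleep should be needed. A pyzmq sketch of the expected behaviour (the inproc endpoint, PUSH/PULL pairing, and message count are my choices for the illustration):

```python
import threading
import zmq

ctx = zmq.Context()

pull = ctx.socket(zmq.PULL)
pull.bind("inproc://flush-test")

received = []
def consume():
    for _ in range(1000):
        received.append(pull.recv())
    pull.close()

t = threading.Thread(target=consume)
t.start()

push = ctx.socket(zmq.PUSH)
push.connect("inproc://flush-test")
for i in range(1000):
    push.send(b"msg-%d" % i)

push.close()   # default LINGER is -1: pending messages are kept
ctx.term()     # blocks until every queued message is delivered
t.join()
print(len(received))   # all 1000 arrive without any sleep
```

If messages go missing without a sleep despite zmq_term being called, that points at a bug, which is sustrik's conclusion above.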
[10:45] ntelford I remember a while ago we had a conversation about this
[10:45] ntelford but ZMQ sockets don't allow you to find out the number of messages waiting in their queue do they?
[10:45] mikko ntelford: nope
[10:45] ntelford :(
[10:46] ntelford shame
[10:46] ntelford it'd be a useful metric for monitoring
[10:46] ntelford so you can know when your application isn't consuming messages fast enough
[10:55] ntelford discussing it with Ben, he said that the HWM will cause client to block when reached
[10:55] ntelford which mitigates the issue of queues not being consumed fast enough
[10:55] ntelford however, with PUB/SUB sockets, the HWM causes messages to be discarded silently
[10:57] sustrik what you want afaiu is a notification that the local queue is almost full, right?
[11:09] ntelford sustrik, well, ideally we could just monitor the size of it at regular intervals - that way it would integrate into any monitoring/alerting platform
[11:11] sustrik what would you do if you found out the queue is almost full?
[11:11] guido_g i wanted to implement something like that in the v2 code some time ago, but failed miserably
[11:12] mikko sustrik: i don't think it's even about that
[11:13] mikko sustrik: you don't need automated reaction necessarily
[11:13] mikko but rather trends
[11:13] mikko helping to identify bottlenecks
[11:13] sustrik ok
[11:13] guido_g a basis for capacity planning and provisioning etc.
[11:13] sustrik a couple of issues to solve to get that kind of thing:
[11:14] sustrik 1. each socket can have multiple queues. How to report that? Sums?
[11:15] sustrik 2. each underlying connection has 4 buffers (0mq tx, tcp tx, tcp rx, 0mq rx)
[11:15] guido_g 2) is known, discussed and not relevant for the use-case
[11:15] sustrik which one would you report then?
[11:16] mikko return a vector of queue sizes and current fillage
[11:16] guido_g would be very good, because you know at the same time how many connections are active
[11:16] sustrik ok, that's an option
[11:16] sustrik what about the 2?
[11:17] sustrik which buffer size is exactly to be reported?
[11:17] guido_g the ømq level only
[11:17] guido_g the remainder is covered by os-level monitoring
[11:17] sustrik just the local one?
[11:17] sustrik tx on sender, rx on receiver?
[11:18] guido_g right
[11:19] guido_g we don't have a lot of bi-directional socket types, do we?
[11:19] sustrik req/rep
[11:20] sustrik that can be solved by having two options instead of one, np
[11:20] guido_g but still, a flag indicating rx or tx stats would be enough
[11:20] guido_g right
[11:20] sustrik what i meant rather is whether the value to be reported should be "all the messages in flight" or "all the messages stored in the local buffer"
[11:21] sustrik the former is not really possible imo
[11:21] guido_g other products even deliver latency histograms
[11:21] guido_g just number of messages queued for send/receive
[11:21] guido_g per connection if necessary
[11:21] sustrik wrt latency -> you need timestamps in the messages for that
[11:22] sustrik which is slowish
[11:22] guido_g also something to get the connection information (peer ip, port etc.)
[11:22] guido_g wllm isn't that slow, right?
[11:22] sustrik wllm?
[11:23] guido_g ibms websphere low latency messaging
[11:24] sustrik dunno, never measured it, still, generating a timestamp using synchronised time is not a cheap operation
[11:24] guido_g don't get me wrong, i don't want that latency thingie in ømq
[11:24] sustrik anyway, if you want to implement that kind of monitoring, i can help you when you run into a problem
[11:25] sustrik i am still not convinced but you are free to try
[11:25] guido_g i was just referring to the features of another product, which is a competitor to ømq (or vice versa)
[11:25] sustrik ack
[11:26] guido_g the queue size and connection monitoring otoh is essential in my eyes, but you know that already
[11:26] sustrik yes
[11:26] sustrik imo just go for it
[11:26] guido_g unfortunately i couldn't find the right points in the v2 code base to get that implemented
[11:27] sustrik ok, let me see
[11:27] guido_g are the newer versions easier to understand?
[11:27] sustrik they are a bit better
[11:27] sustrik but not exactly simple
[11:28] guido_g ok
[11:28] guido_g but given that there is no documentation on how the parts play together, one needs help to understand the basic structure
[11:30] sustrik there's the architecture document
[11:30] sustrik have you read that?
[11:31] sustrik guido_g: ok, found it
[11:31] sustrik check the 2.1 trunk
[11:31] sustrik files src/pipe.hpp and src/pipe.cpp
[11:32] sustrik each pipe (queue) consists of 2 objects: the reader and the writer
[11:32] sustrik currently, the writer has msgs_read and msgs_written variables
[11:32] sustrik the difference between the two is current queue size estimate
[11:33] guido_g ok
[11:33] sustrik it doesn't work the other way round
[11:33] sustrik on the reader side there's msgs_read, but no msgs_written
[11:33] sustrik meaning that you have no estimate
[11:34] sustrik there's a bit to implement to get that
[11:34] sustrik how it works is that writer has idea about the number of messages written (++ for each send)
[11:34] guido_g ok
[11:35] sustrik the number of messages read is delivered to the writer once in a while from the reader
[11:35] guido_g aha
[11:35] sustrik it's done using activate_writer command
[11:35] sustrik see line 303
[11:35] sustrik to get an estimate on the reader
[11:36] sustrik you would have to send msgs_written from writer to reader occasionally
[11:36] sustrik it's basically the same thing, just other way round
[11:37] sustrik side note: it would be also preferable to process any pending commands when doing getsockopt(QUEUE_SIZES)
[11:37] sustrik as that would process any outstanding activate_writer commands
[11:37] sustrik and thus make the estimate more precise
[11:38] guido_g the pipe_full method uses that
[11:38] guido_g so basically it's already there
[11:38] sustrik yes
[11:38] sustrik on the writer side
[11:39] sustrik you still have to do the same thing in the other direction
[11:39] guido_g ahhh no hwm check on the reader side?
[11:39] sustrik yup, you don't need to check hwm when reading a message
[11:41] sustrik i mean, when you read a message from a queue
[11:41] guido_g ah ok
[11:41] sustrik there's no danger you'll hit HWM
[11:41] guido_g ahhh ... i think i got it
[11:42] guido_g there is a pipe for every direction a socket supports
[11:42] guido_g when writing the pipe writer counts the outgoing message
[11:43] sustrik yes
[11:43] guido_g on the incoming side, the writer is on the network side of the pipe
[11:43] guido_g counting the messages received
[11:43] sustrik yes
[11:43] guido_g good
[11:43] guido_g we only need the two writers then
[11:44] guido_g query the queue size (as done in the pipe_full method) and we're set
[11:44] sustrik you want the current number of messages, not the actual max sequence number written
[11:44] sustrik so you have to have both last seq num written and last seq num read
[11:44] sustrik the difference is the queue size
[11:44] guido_g right, as in pipe_full method
[11:45] sustrik yup
[11:45] guido_g we just extract the calculation into a method and use that
[11:45] sustrik yes
[11:45] sustrik but that works only for writer
[11:45] guido_g but every pipe has a writer
[11:46] sustrik sure, but if you are a reader you have no access to the writer
[11:46] sustrik for example, for inbound messages
[11:46] guido_g ouch
[11:46] sustrik the application thread is the reader
[11:46] sustrik and I/O thread is the writer
[11:47] sustrik so, in the application thread you have clear idea of what is the last read seq num
[11:47] guido_g so we need a way to get that value w/o blocking either thread
[11:47] sustrik but you have to find out last written seqnum with the I/O thread
[11:47] sustrik yes
[11:47] sustrik now it works only in one direction
[11:48] guido_g practicality beats purity, as the zen of python says
[11:48] sustrik reader sends max read seqnum to writer
[11:48] guido_g i would say it's much more important to get the idea of the send queue size
[11:48] guido_g which is in reach
[11:48] sustrik yup, that's pretty easy
[11:48] guido_g because we can access the writer for sends
[11:49] sustrik you should still consider the precision issue
[11:49] sustrik the command notifying writer about last read seq num is sent "once in a while"
[11:49] guido_g these values are estimates, people using them know that (at least they should)
[11:49] sustrik which basically means every N messages
[11:50] sustrik precision of the estimate is determined by N
[11:50] sustrik the N itself is computed from HWM
[11:50] guido_g the whole idea is to get some data to build upon
[11:50] sustrik ok
[11:51] sustrik you have the idea how it works now
[11:51] sustrik if you encounter any further problems feel free to consult me
[11:51] guido_g thx, sure will do
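The scheme sustrik walks through above condenses to a short sketch (pure Python; class and method names are mine, not libzmq's): the writer counts its own sends, learns the reader's count only once every N messages via an activate_writer-style command, and the difference between the two counts is the queue-size estimate, accurate to within N.

```python
class PipeWriter:
    """Writer side of a pipe, estimating queue size from seqnums."""
    def __init__(self, sync_every=8):   # N; libzmq derives it from HWM
        self.msgs_written = 0
        self.msgs_read = 0              # last count reported by the reader
        self.sync_every = sync_every

    def write(self):
        self.msgs_written += 1          # ++ for each send

    def on_activate_writer(self, reader_msgs_read):
        # the reader reports its read count via a command,
        # once every sync_every messages
        self.msgs_read = reader_msgs_read

    def queue_size_estimate(self):
        # exact right after a report; off by at most sync_every otherwise
        return self.msgs_written - self.msgs_read
```

Getting an estimate on the reader side is the mirror image: the writer would have to report msgs_written to the reader occasionally, which is the part sustrik says is not implemented yet.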
[12:03] sustrik mikko: are you still there?
[12:06] guido_g in v3 there is no pipe_full method anymore?
[12:07] guido_g ahh inlined into check_write
[12:11] guido_g poor boy :)
[13:06] mikko sustrik: i'm seeing very strange pattern
[13:06] sustrik yes?
[13:07] mikko im receiving message with ROUTER socket
[13:07] mikko and sending back ack
[13:07] mikko each way it's multipart
[13:07] mikko but for some reason it seems that my ACK messages get batched
[13:07] mikko and sent in one chunk at the end
[13:07] mikko i added some debug code to the device and this is the traffic coming from client:
[13:07] mikko
[13:08] mikko out to in is from client to server (ack)
[13:08] mikko and in to out is from server to client
[13:08] mikko unless i sleep at the end of the client process that big chunk of ACKs is lost
[13:10] sustrik ok, let's make a simple test case for it then
[13:16] mikko sustrik: i can't seem to get it out on simple case
[13:17] sustrik ok, what about the complex case?
[13:17] sustrik can i reproduce it?
[13:18] mikko sustrik: yes
[13:18] sustrik ok, can you create a ticket with the steps to reproduce?
[14:01] sundbp cremes: line 68 in util.rb - this is always false or am i missing something? looks convoluted as is
[14:03] sundbp cremes: because line 61 already means the second term of line 68 is always false
[14:04] cremes sundbp: i'll take a look at it in about an hour... in the middle of something else at the moment
[14:05] sundbp cremes: cool. i started to play with removing some of the exception raising in favor of true/false (or nil) + errno
[14:07] sundbp cremes: makes sense for me to let constructors for context and socket raise, and for Message to raise internally. None of those APIs are direct analogies to C API. things like #bind, #connect, #close, etc though would be nice if they didn't raise
[14:31] sundbp cremes: question - there's a send_and_close() method. is that one necessary given that zmq_send takes ownership of the message? the ruby obj doesn't need to be closed explicitly, can be GC'd
[14:34] sundbp from zmq_send: The zmq_msg_t structure passed to zmq_send() is nullified during the call.
[14:34] sundbp does that mean the memory is freed or just that it's set to zeros?
[14:39] sundbp (v2.1.9 docs that is)
[14:48] cremes sundbp: regarding util.rb, you are correct
[14:48] cremes but it kind of doesn't matter since that code is going away
[14:49] cremes also, i agree with you about the constructors raising while the send/recv code just uses return codes
[14:53] sundbp cremes: cool. yes, agreed - wasn't going to use that code anymore, disappearing.
[14:54] sundbp cremes: pushed my first few changes to my github fork if you want to have a look. fixed a bug with send_strings as well - was a range given as [0..-1] that should be [0..-2]
[14:55] sundbp cremes: just playing with it a bit, not necessarily suggesting you want to do anything with it (apart from that range bug)
[14:59] cremes sundbp: there isn't a bug in #send_strings
[15:00] cremes it uses three dots for the range which means the final value is not included
[15:00] cremes check for yourself
[15:00] cremes it's using parts[0...-1] which is the same as parts[0..-2]
[15:01] sundbp cremes: ah - i've never seen 3 dots used before.
[15:02] sundbp cremes: then false alarm - thought it was weird :)
[15:02] cremes it's a weird ruby-ism
[15:03] cremes parts[0..-2] is probably a bit clearer though... i'll fix it today as i rip this thing apart
[15:04] sundbp cremes: yeah, hadn't seen before. prefer 2 dots to keep the indices consistent
[15:04] sundbp cremes: i.e. [0..-2] and then [-1]
[15:04] cremes right
[15:05] cremes already fixed :)
[21:08] cremes sundbp: rewrote everything to return explicit return codes and updated the docs
[21:08] cremes now i just need to modify the examples to use the new api and i'll push my changes to github
[21:08] cremes i'd appreciate it if you would test them out (all specs are green against 2.1.x and 3.x)
[21:30] dhyams hello, I'm probably not understanding something fundamental,
[21:30] dhyams but how might I check to see if there is actually someone listening on the other side of a connection?
[21:31] dhyams e.g. in Python, if I call sock.send('foo')
[21:31] dhyams how do I know I will ever get a reply? (this is a REQ/REP socket)
[21:32] minrk in general, you don't
[21:32] minrk but if a req socket has no peer, send will block
[21:33] dhyams thanks minrk; so there is no way for me to avoid the send blocking? That's what I'm trying to get around...just hangs my program
[21:33] minrk sure there is
[21:34] minrk send(msg,flags=zmq.NOBLOCK) will raise EAGAIN if it can't send
[21:34] minrk or you can poll with zmq.POLLOUT, to see if the socket is ready to send
[21:35] dhyams thanks much....actually, I thought it was blocking on the following recv, but looking for the EAGAIN will work just great
[21:35] dhyams [I'm only a day old at network programming, lol]
[21:35] dhyams thanks for your help...still learning the ropes.
[21:36] minrk sure thing - I'm still learning, too
[21:40] dhyams hmm, what I'm seeing is that the send still proceeds just fine even with zmq.NOBLOCK....perhaps it only blocks when local queues are too full to send?
[21:41] dhyams It's blocking on the recv that never comes, so I'll move the NOBLOCK flag to there and see what happens
[21:41] dhyams and timeout manually with a sleep, I guess
[21:41] minrk if you want to block for a period of time, use poll
[21:41] minrk that's what it's there for
[21:41] dhyams ok
[21:42] minrk p = zmq.Poller()
[21:42] minrk p.register(sender, zmq.POLLOUT)
[21:42] minrk p.register(receiver, zmq.POLLIN)
[21:43] minrk evts = p.poll(1000) # timeout is in milliseconds
[21:44] minrk etc.
[21:44] minrk poll will return immediately when at least one registered event is available, or return an empty list if the timeout is reached and no events are available
[21:45] dhyams ok
[21:45] dhyams thanks yet again