ZeroMQ IRC Log

Monday February 21, 2011

[Time] Name	Message
[02:54] JStoker	Hi. Are there any Debian packages for zeromq? :)
[02:56] JStoker	Aah. Thanks, I'll try to work out a way of coaxing that into my squeeze install. :)
[02:57] JStoker	Ah. Hm.
[02:58] JStoker	The debian machine works fine with compiling, I guess. The ubuntu one on the other hand seems to have a bit of a dependency failure, so isn't able to actually compile it itself.
[02:59] JStoker	Requires uuid-dev, but "libuuid1 (= 2.17.2-0ubuntu1) but 2.17.2-0ubuntu1.10.04.2 is to be installed"
[03:00] JStoker	I just tried `apt-get install uuid-dev`, didn't specify a version :/
[03:02] JStoker	I've just forced it to a different version, which has seemed to help... (it's installing, at least)
[03:17] JStoker	That's the ub... oh. I just went and built my own package out of git://github.com/zeromq/zeromq2.git. Whoops.
[03:17] JStoker	seb`, Question. Are 2.0.10 and 2.1.x compatible?
[03:18] JStoker	That does report itself as 2.1.1 though, so no issues. :)
[03:20] JStoker	oh well. Got the difficult system running with 2.1 anyway, so might as well continue with it on the debian one.
[10:23] CIA-21	zeromq2: Martin Sustrik master * r5c09311 / (src/pgm_socket.cpp src/pgm_socket.hpp):
[10:23] CIA-21	zeromq2: Computation of buffer size for PGM fixed.
[10:23] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/i9Ptxs
[10:23] pieter_hintjens	sustrik, I'm seeing zmq_term just exit the process
[10:24] pieter_hintjens	does that sound normal?
[10:24] sustrik	no
[10:24] sustrik	do you have a test case
[10:24] sustrik	?
[10:24] pieter_hintjens	trying to make some of the Guide examples work with 2.1...
[10:24] pieter_hintjens	yes, I have a test case... I'll email it to you
[10:24] sustrik	thanks
[10:25] pieter_hintjens	https://gist.github.com/836902
[10:26] pieter_hintjens	also some problems transferring messages but I'll break that down
[10:27] sustrik	ok
[10:28] mikko	good morning
[10:28] sustrik	morning
[10:29] pieter_hintjens	hi mikko :-)
[10:34] pieter_hintjens	sustrik: you didn't add my log on unroutable messages...
[10:35] sustrik	nope
[10:35] pieter_hintjens	any particular reason?
[10:35] sustrik	it's a standard behaviour
[10:35] sustrik	nothing special to log
[10:35] sustrik	it's like dropping an IP packet
[10:35] pieter_hintjens	i violently disagree
[10:36] pieter_hintjens	debugging dropped messages is one of the major pains
[10:36] pieter_hintjens	xrep dropping a malformed envelope is not like dropping an IP packet
[10:37] pieter_hintjens	it's a sign of application error, and that information is particularly valuable to the developer
[10:37] sustrik	yes, we should log malformed envelope
[10:37] pieter_hintjens	that is what I had done...
[10:37] sustrik	not dropping a well-formed envelope though
[10:37] pieter_hintjens	malformed = no such address, sorry
[10:37] sustrik	well-formed message i mean
[10:37] sustrik	it's on critical path
[10:38] pieter_hintjens	errors are exceptional;
[10:38] sustrik	if you need it for debugging
[10:38] sustrik	we can add some special debug functionality
[10:38] pieter_hintjens	when you start to use xrep this is a major pain
[10:38] pieter_hintjens	errors -> silence
[10:38] sustrik	yes
[10:38] pieter_hintjens	the only way I know to debug is to hack 0MQ to print something
[10:38] pieter_hintjens	it's a major annoyance
[10:38] sustrik	it means that the original requester is no longer available
[10:38] pieter_hintjens	no
[10:38] sustrik	so the reply is dropped
[10:38] pieter_hintjens	it means there's a bug in my app code
[10:39] sustrik	?
[10:39] pieter_hintjens	i assume you've never written code that uses xrep then
[10:39] sustrik	i did
[10:39] sustrik	queue device :)
[10:39] pieter_hintjens	well, I've done a lot of work using xrep
[10:39] pieter_hintjens	and it's tricky
[10:40] pieter_hintjens	and the classic 80% error is a bad envelope (valid for 0MQ but where the routing information is wrongly placed)
[10:40] pieter_hintjens	0MQ absolutely must report this
[10:40] pieter_hintjens	ask on the list if you're not sure
[10:40] pieter_hintjens	it's not a critical path
[10:40] pieter_hintjens	there are not millions of malformed addresses per second
[10:40] pieter_hintjens	no way
[10:40] sustrik	how would you distinguish it from simply dropping the packet because the destination is simply dead atm
[10:40] mikko	pieter_hintjens: this is pretty classic as well http://bit.ly/g6RXaS?r=td
[10:40] pieter_hintjens	same thing, sustrik
[10:40] mikko	might want to feature in the guide
[10:40] pieter_hintjens	not critical path, that's the first thing
[10:41] pieter_hintjens	this is an exception and can be logged
[10:41] pieter_hintjens	mikko: yes, ipc has some gotchas
[10:41] pieter_hintjens	i'm trying to debug a case now and the 'silence when dropping messages' just cost me an hour
[10:41] pieter_hintjens	bleh
[10:42] sustrik	let's add some debugging code then
[10:42] pieter_hintjens	sustrik: why? you have a mechanism that's ideal
[10:42] pieter_hintjens	as a user I'm asking formally that syslog be used to report bad addresses
[10:42] sustrik	the problem is that a node being offline is a pretty common occurrence
[10:42] pieter_hintjens	it will also help debug topology errors
[10:42] sustrik	at least on wan
[10:42] pieter_hintjens	'pretty common' is not a critical path
[10:42] pieter_hintjens	wan is not our environment today
[10:43] pieter_hintjens	these are bogus arguments afaics, sorry
[10:43] sustrik	i'm planning for the future
[10:43] pieter_hintjens	you're making it unnecessarily difficult for the present
[10:43] pieter_hintjens	at which point the future becomes less certain at all
[10:43] pieter_hintjens	if people cannot easily debug routing issues and topology issues they will lose confidence in 0MQ
[10:44] pieter_hintjens	if you actually hit this theoretical issue of millions of WAN disconnections
[10:44] pieter_hintjens	then you can solve that problem in the correct fashion
[10:44] pieter_hintjens	refusing to log this information today is not a solution
[10:44] pieter_hintjens	you would probably want to reduce the logging level on specific sockets
[10:44] pieter_hintjens	in 3 years' time
[10:44] sustrik	ok, the real problem is that once i add that kind of thing, people start using it to drive their business logic
[10:45] pieter_hintjens	it's syslog
[10:45] pieter_hintjens	syslog cannot be used for presence
[10:45] pieter_hintjens	we know this, it's not a risk
[10:45] sustrik	which means the applications would become not scalable outside of a lan
[10:45] pieter_hintjens	?
[10:45] pieter_hintjens	sorry, let's kill inproc then
[10:45] pieter_hintjens	and multicast
[10:45] sustrik	?
[10:45] pieter_hintjens	cause people might depend on that
[10:45] pieter_hintjens	and not be scalable outside the lan
[10:46] pieter_hintjens	seriously?
[10:46] sustrik	nope, you can scale inproc and multicast by replacing them with tcp
[10:46] pieter_hintjens	sigh
[10:46] sustrik	when you start using logs in your business logic
[10:46] pieter_hintjens	how?
[10:46] sustrik	you are screwed
[10:46] pieter_hintjens	how do you use logs in your business logic?
[10:46] sustrik	as the logs are local
[10:46] pieter_hintjens	syslog is async
[10:46] sustrik	1. send x to y
[10:47] pieter_hintjens	you cannot rely on it for business logic
[10:47] pieter_hintjens	we already discussed this on the list
[10:47] sustrik	2. if i get "non routable" error
[10:47] sustrik	3. send it somewhere else
[10:47] sustrik	yes, i've discussed it many times
[10:47] pieter_hintjens	so people can abuse a tool
[10:47] pieter_hintjens	that does not mean you throw away the tool
[10:47] sustrik	and each time people asked for this kind of thing
[10:47] pieter_hintjens	you educate them
[10:47] sustrik	i discussed it with them
[10:48] pieter_hintjens	education, not censorship martin
[10:48] sustrik	and after some conversation it became obvious they want to drive their business logic
[10:48] pieter_hintjens	you do not actually know what people need
[10:48] sustrik	not a single person wanted just logging
[10:48] pieter_hintjens	i want just logging
[10:48] sustrik	it's 100% misuse rate
[10:48] sustrik	99% then
[10:49] sustrik	ok, an idea
[10:49] sustrik	what about having a "debug" build which would log this kind of errors
[10:49] pieter_hintjens	and if people want to use routing failures in their app, why not?
[10:49] sustrik	and release build that won't
[10:49] sustrik	?
[10:49] pieter_hintjens	hang on
[10:50] sustrik	it doesn't scale
[10:50] pieter_hintjens	I'm challenging your assertion that using this information won't scale
[10:50] pieter_hintjens	if I make a broker that uses this
[10:50] pieter_hintjens	the broker will scale fine
[10:50] pieter_hintjens	it's like getting disconnection alerts
[10:51] sustrik	wait a sec
[10:51] sustrik	let me explain
[10:51] sustrik	say you have a server (XREP)
[10:51] sustrik	and clients (XREQ)
[10:52] sustrik	the server uses sys://log to find out whether clients are disconnected
[10:52] pieter_hintjens	yes?
[10:52] sustrik	when it finds out, it will do something relevant, like alert the operator or somesuch
[10:52] pieter_hintjens	you mean ... log a message somewhere else?
[10:53] pieter_hintjens	on the operator console, for example...
[10:53] sustrik	i mean, initiate some other business process
[10:53] sustrik	like sending the message by DHL
[10:53] pieter_hintjens	? this is your example?
[10:53] pieter_hintjens	business logic would be
[10:53] sustrik	or placing a message on dlq
[10:53] pieter_hintjens	- discover principal connection is dead
[10:53] pieter_hintjens	- switch over to backup connection
[10:53] sustrik	or running a special repair script
[10:54] sustrik	any business logic will do
[10:54] sustrik	now
[10:54] sustrik	the company grows
[10:54] sustrik	and has several offices all over the world
[10:54] sustrik	it wants to federate
[10:54] sustrik	so they place a queue device at each location
[10:55] pieter_hintjens	sustrik, you have device obsession
[10:55] sustrik	now imagine a client at point A invoking a service at point B
[10:55] pieter_hintjens	people do not, ime, use the standard devices except as a stepping stone while they're learning 0MQ
[10:55] sustrik	the connection to A may be perfectly ok, but the client may be dead
[10:55] pieter_hintjens	they build their own devices, = brokers
[10:55] sustrik	the same thing
[10:56] sustrik	the problem is that the "cannot route" error works in hop-by-hop fashion
[10:56] pieter_hintjens	i really can't swallow your chain of logic
[10:56] pieter_hintjens	it is based on the assumption that all 0MQ applications today will one day scale to the WAN
[10:56] sustrik	so the error you get is 'cannot route to next hop'
[10:56] sustrik	instead of 'message is unroutable'
[10:56] pieter_hintjens	which is true for about 0.001% of cases, maybe
[10:57] pieter_hintjens	99.999% of today's apps (and those written for the next two years) will do what they do, and basta
[10:57] pieter_hintjens	that is our constituency
[10:57] sustrik	if IP folks would think that way Internet would interconnect the original 3 machines still :)
[10:57] pieter_hintjens	trying to stop people writing those apps any way they can is counter-productive
[10:57] pieter_hintjens	i'm not a luddite, please
[10:58] pieter_hintjens	and you are NOT reinventing the Internet, don't imagine that
[10:58] pieter_hintjens	I'm asking for basic tools to make today's problem slightly easier to solve
[10:58] sustrik	nope, i'm building a new layer for it
[10:58] sustrik	ack
[10:58] pieter_hintjens	you ignore the requests of your users at your direct peril
[10:58] sustrik	i know
[10:58] pieter_hintjens	people will not give much patience to a nerd who tells them "I know your problems better than you do"
[10:59] sustrik	there are several options how to solve the problem
[10:59] pieter_hintjens	you need people to bring their problems and expertise
[10:59] pieter_hintjens	to experiment, even if it means blowing things up
[10:59] pieter_hintjens	without that, what you make will be dead and unused
[10:59] pieter_hintjens	just a collection of weird names and bizarre theories
[10:59] pieter_hintjens	that's 90% of the software universe
[10:59] sustrik	i would suggest you fork the stable
[11:00] pieter_hintjens	not yet
[11:00] sustrik	and apply the patch there
[11:00] pieter_hintjens	i cannot run my basic examples on 2.1
[11:00] pieter_hintjens	and I'm not going to fork 0MQ... nope
[11:00] sustrik	a regression?
[11:00] sustrik	what happened?
[11:00] pieter_hintjens	stabilization = current version + patches from you
[11:00] pieter_hintjens	no more or less
[11:01] pieter_hintjens	I'd like to request some debugging framework (syslog is frankly quite heavy) that tells me when I make the stupid and repeated error of misconstructing a routing envelope
[11:01] sustrik	well, the problem is we have different goals
[11:01] pieter_hintjens	we have different timeframes, is all
[11:01] pieter_hintjens	i'm actually using 0MQ and finding difficulties with it
[11:01] sustrik	how can we possibly synchronise?
[11:02] pieter_hintjens	accept the problems your users present you as real
[11:02] sustrik	sure they are
[11:02] pieter_hintjens	if they take the effort to express them, that means there is real pain involved
[11:02] sustrik	i proposed a solution: let's make a debug version of 0mq
[11:02] pieter_hintjens	yes, that might work but
[11:02] pieter_hintjens	we know why it won't, in practice
[11:02] sustrik	#ifdef ZMQ_DEBUG
[11:02] sustrik	log (...);
[11:02] sustrik	#endif
[11:02] sustrik	easy
[11:03] pieter_hintjens	Except that many people provide 0MQ to their own customers
[11:03] pieter_hintjens	Now they face the lovely choice of (a) debug version or (b) silent on failure version
[11:05] pieter_hintjens	ok, I have a proposal
[11:05] pieter_hintjens	i don't actually like the syslog solution at all, for developers
[11:06] pieter_hintjens	it's a pain to use and all it really does is collect output you can print to a log file
[11:06] pieter_hintjens	if you try to use that information in any way you hit Sustrik's Barrier of "it will not scale to the WAN"
[11:06] pieter_hintjens	ack?
[11:07] pieter_hintjens	it can work internally as a log collector
[11:07] pieter_hintjens	so leave it undocumented, and add a method zmq_verbose() that enables printing of syslog messages
[11:07] pieter_hintjens	or zmq_debug() or whatever
[11:08] sustrik	not bad
[11:08] sustrik	same as ZMQ_DEBUG but configurable at runtime
[11:08] pieter_hintjens	then when things go weird we can tell users, "run again with the --verbose switch and send me the screen dump"
[11:08] pieter_hintjens	yes, has to be runtime configurable
[11:08] sustrik	ack
[11:08] sustrik	that's a good solution
[11:08] pieter_hintjens	IMO there is no performance hit in doing this for all exceptional conditions
[11:08] pieter_hintjens	if we do see such a performance issue, we can solve it
[11:09] pieter_hintjens	if the syslog collector is embedded in zmq, it can't be abused and it requires no extra work except that one call
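
The runtime switch agreed on here can be sketched as a toy model. This is an illustration in Python, not libzmq code: `zmq_verbose()` is only the name floated in the chat, and `ToyRouter`, `connect_peer`, `send`, and `sink` are hypothetical names invented for this sketch. It shows the one behaviour being argued about: an unroutable message is always dropped, and it is reported only when verbose mode has been switched on at runtime.

```python
import sys


def _stderr_sink(line):
    """Default log sink: print the line to standard error."""
    print(line, file=sys.stderr)


class ToyRouter:
    """Toy stand-in for an XREP-style socket (hypothetical, not libzmq)."""

    def __init__(self, verbose=False, sink=None):
        self.verbose = verbose  # runtime switch, in the spirit of the proposed zmq_verbose()
        self.sink = sink if sink is not None else _stderr_sink
        self.peers = {}         # identity (bytes) -> list of delivered frame lists
        self.dropped = 0        # count of silently dropped messages

    def connect_peer(self, identity):
        """Register a peer so that messages addressed to it become routable."""
        self.peers[identity] = []

    def send(self, frames):
        """Route frames = [identity, b"", body, ...]; return True if delivered."""
        identity, rest = frames[0], list(frames[1:])
        inbox = self.peers.get(identity)
        if inbox is None:
            # The drop itself is unconditional; only the reporting is optional.
            self.dropped += 1
            if self.verbose:
                self.sink(f"dropping unroutable message, no peer {identity!r}")
            return False
        inbox.append(rest)
        return True
```

Because the report goes to a local sink and `send` behaves the same way whether or not it is enabled, the switch only changes what the developer sees on screen.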
[11:14] Guthur	is there an optimum zmq_msg size?
[11:14] pieter_hintjens	Guthur: to achieve what?
[11:14] pieter_hintjens	high throughput or low latency?
[11:15] Guthur	I was thinking of adding a stream interface to clrzmq2
[11:15] pieter_hintjens	what is a stream interface?
[11:15] Guthur	and so i was thinking it would send that as a series of msgs
[11:16] Guthur	just passing in a stream to send
[11:16] pieter_hintjens	no frames?
[11:16] Guthur	and then it would split it up into chunks and send
[11:16] pieter_hintjens	if it was video, for example
[11:17] pieter_hintjens	you'd want to send one frame per message to get best latency
[11:17] Guthur	I haven't thoroughly thought it through yet; do you recommend framing?
[11:17] pieter_hintjens	if it was file transfer, for example, you'd use large messages to maximize throughput
[11:17] Guthur	early idea genesis stage at the moment, hehe
[11:18] pieter_hintjens	then just choose a random value and optimize later when you actually know what the target is :-)
[11:18] pieter_hintjens	4096 bytes per message, there, I've decided for you
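
A minimal sketch of the chunking idea, assuming the stream is any binary file-like object and that each chunk would become one message. It is written in Python for illustration rather than in the C# of clrzmq2, and `chunk_stream` is a hypothetical helper name. The 4096-byte default is simply the placeholder value picked in the chat, to be tuned once the real target (latency or throughput) is known.

```python
import io


def chunk_stream(stream, chunk_size=4096):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        yield chunk


if __name__ == "__main__":
    # 10000 bytes split into two full chunks and one remainder.
    sizes = [len(c) for c in chunk_stream(io.BytesIO(b"x" * 10000))]
    print(sizes)  # [4096, 4096, 1808]
```

How the chunks are framed on the wire (separate messages, or parts of one multipart message) is left open here, as it was in the conversation.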
[11:18] Guthur	oh, ok
[11:18] Guthur	hehe, thanks
[11:19] mikko	Guthur: do you have tests you want to have run at some point?
[11:19] Guthur	mikko: not yet, I hope to sometime
[11:19] Guthur	a lot on my plate at the moment
[11:20] Guthur	really behind on my MSc thesis, unfortunately
[11:20] pieter_hintjens	"life is a buffet of interesting problems"
[11:24] mikko	"nothing is more expensive than hiring an amateur"
[11:24] mikko	saw that on twitter today
[11:24] pieter_hintjens	"If you think a professional is expensive, wait till you see the cost of hiring an amateur" was the version I saw
[11:24] pieter_hintjens	I think this would make a good motto for iMatix
[11:25] pieter_hintjens	I prefer your short version...
[11:26] mikko	http://twitter.com/cgbystrom/statuses/39607943171284992
[11:26] mikko	there it is
[11:26] pieter_hintjens	I need a Latin version
[11:27] Guthur	'If you think training a Graduate is expensive, try an Old Dog'
[11:28] Guthur	there's mine, hehe
[11:29] mikko	"If you think training a graduate is expensive, you are probably right but it might pay off later. Maybe"
[11:31] pieter_hintjens	I'd like to change the text "Explore the Community" on the main welcome page
[11:31] pieter_hintjens	it is too bland
[11:31] pieter_hintjens	It should be verb article noun
[11:33] pieter_hintjens	Escape the Box
[11:34] pieter_hintjens	that'll do...
[11:34] mikko	reminds too much of thinking out of the box
[11:34] pieter_hintjens	that was the intention
[11:35] mikko	oh..
[12:17] pieter_hintjens	sustrik: how can I tell if a specific commit got into the 2.0.10 release?
[12:17] pieter_hintjens	e2167cecaefec6557c7a5712fb75e51487ff69a6 is the one I'm interested in
[12:19] pieter_hintjens	ok, it's not there... np
[12:46] pieter_hintjens	sustrik: I've found another bug in master
[12:46] pieter_hintjens	am porting all the Guide examples to 2.1, some of them do quite strange stuff
[12:47] pieter_hintjens	Have logged this https://github.com/zeromq/zeromq2/issues/167 (and #168 was the one I found earlier)
[12:50] pieter_hintjens	lunch, brb
[12:55] sustrik	pieter_hintjens: ok
[12:55] sustrik	btw "fathom the basics" is not good
[12:55] sustrik	it's a rarely used word
[12:55] sustrik	and most non-native speakers won't understand it
[12:56] sustrik	same with "grab"
[12:57] sustrik	also, nobody will understand that "escape the box" leads to the community page
[14:32] pieter_hintjens	sustrik: sure, but what does "the community" mean either?
[14:32] sustrik	dunno :)
[14:32] pieter_hintjens	exactly, it only makes sense when you already know what it is...
[14:33] sustrik	development?
[14:33] pieter_hintjens	... something that says, "Get a lot more stuff here..."
[14:33] sustrik	yes
[14:33] pieter_hintjens	Get the Addons
[14:33] pieter_hintjens	ugh
[14:33] mikko	addon sounds proprietary
[14:33] sustrik	or simple "more"
[14:34] pieter_hintjens	from a design basis it's Verb article Noun
[14:34] pieter_hintjens	minimalism is quite difficult... :)
[14:34] sustrik	true
[14:35] sustrik	proceed to more
[14:35] sustrik	check more stuff
[14:35] pieter_hintjens	Enter the Deep End
[14:35] pieter_hintjens	Dive into the Boiling Tarpits of Git
[14:36] sustrik	something like that
[14:36] pieter_hintjens	anyhow it doesn't really matter if people understand what each option means...
[14:36] pieter_hintjens	there are 5 options, clickety-click...
[14:36] pieter_hintjens	all we want is their curiosity
[14:37] sustrik	we can try to leave it as it is
[14:37] pieter_hintjens	no-one's going to hesitate thinking, "Oh, I don't fully understand that, therefore I can't click"
[14:37] sustrik	what i find really wrong is "fathom"
[14:37] pieter_hintjens	sure
[14:37] pieter_hintjens	I wanted a little more spice
[14:37] pieter_hintjens	Learn the Basics
[14:37] pieter_hintjens	but Learn sounds like school
[14:38] pieter_hintjens	and "Read the Manual" just sounds like work
[14:38] sustrik	read the manual is neutral imo
[14:38] pieter_hintjens	hmm, basics vs. advancedics
[14:38] sustrik	what about community -> "dive into details"
[14:39] pieter_hintjens	I like it
[14:39] pieter_hintjens	that contrasts with Basics nicely
[14:40] pieter_hintjens	aight, we have a winner, thanks Martin :-)
[14:44] jobytaffey	I'm working on a server which multiplexes many TCP sockets into 0MQ. I want a worker app to be able to talk bi-directionally to any TCP socket. What's an efficient way to approach this? One 0MQ socket per TCP socket? or perhaps a single 0MQ socket per direction, aggregating all TCP messages (filtered with ZMQ_SUBSCRIBE)?
[14:45] pieter_hintjens	jobytaffey: have you read the Guide?
[14:46] jobytaffey	Bits of. I guess I'll go and read some more then...
[14:46] pieter_hintjens	this is not actually covered but it will help you understand how to make the 0MQ part
[14:47] pieter_hintjens	the answer depends a lot on what kind of work you are doing with the TCP sockets
[14:47] pieter_hintjens	i.e. how you map the TCP side to one or more 0MQ patterns (pubsub, req-rep, pipeline)
[14:49] jobytaffey	I've got fixed sized packets going in both directions over TCP. I have an online wireless sensor network where I want to be able to register for telemetry feeds from the devices. But, given millions of devices, I don't think I can afford a 0MQ socket per device.
[14:50] pieter_hintjens	jobytaffey: please read the Guide, at least Ch1 in detail, and come back when you can map your problem to 0MQ patterns
[14:50] pieter_hintjens	otherwise you lack the tools to design this properly
[14:51] pieter_hintjens	it sounds like pubsub except you say it's bidirectional
[14:51] jobytaffey	I agree I'm trying to run before walking, thanks.
[15:00] fbarriga	I don't speak english very good but after a while I could decipher the expression "Dive into the Boiling Tarpits of Git" but "fathom" is quite weird
[15:01] fbarriga	reading the guide is quite amusing but all the decoration make it a bit longer to finish.
[15:02] neale1	fathom == a unit of depth (6 feet ~ 2m). The expression is used to indicate that you are getting "deeper" into the subject matter.
[15:03] pieter_hintjens	fbarriga: ah, but you did finish it ... :-)
[15:03] pieter_hintjens	we had some arguments at the start, about whether or not to decorate the text
[15:03] pieter_hintjens	some people complained about it, but most people enjoyed it, so that's the style I used
[15:04] fbarriga	pieter_hintjens, nop, I was working on other stuff :(
[15:04] pieter_hintjens	for those who prefer their text dry we have the man pages, I guess
[15:06] fbarriga	is more 'artistic' the way you write the guide. I think that it sound more intellectual rather than technical.
[15:13] pieter_hintjens	fbarriga: well, to be honest, this is how I write when no-one is telling me to conform
[15:13] pieter_hintjens	s/telling/paying/
[15:24] fbarriga	that's good
[15:49] cremes	is this callstack from 0mq sending or receiving a message? https://gist.github.com/837241
[15:50] neale1	anyone had any experience using 0MQ with Spring?
[15:54] pieter_hintjens	cremes: it looks like it's in the zmq_msg_init_size method
[15:55] pieter_hintjens	neale1: not as far as I know of, there are some people talking about it but nothing concrete yet
[15:56] neale1	Tks. I have a client trying to use it and having problems. (That's about as detailed as they have described it unfortunately)
[15:57] cremes	pieter_hintjens: right; but i'm wondering if this is 0mq in the act of sending or receiving a message because this isn't code i am calling directly
[15:57] cremes	and i'm showing that i have a small memory leak here on occasion
[15:58] pieter_hintjens	cremes: hmm, sustrik would know for sure but it looks like an input event, so receiving a message
[15:58] pieter_hintjens	neale1: you obviously need more detail on what the problems are...
[15:59] cremes	yeah, i'm thinking this is 0mq allocating its own msg_t for receiving the message envelope/header/routing information for a req/rep pair
[16:03] sustrik	cremes: receiving
[16:04] cremes	sustrik: any chance the lib is not calling close on the message envelope (which isn't passed to the user)?
[16:04] sustrik	cremes: how do you know there's a memory leak?
[16:04] sustrik	sure, there can be a bug
[16:04] sustrik	can you be more specific about the leak?
[16:04] cremes	on osx there is a program called 'leaks' that can read the heap and use a malloc flag to find leaks
[16:04] cremes	yes
[16:05] sustrik	ok, so it maps malloc to frees
[16:05] sustrik	and once the program exits
[16:05] sustrik	it'll print out the leaks
[16:05] cremes	'leaks' is outputting that i am leaking 112 bytes periodically
[16:05] sustrik	right?
[16:05] cremes	nope, it analyses a live heap
[16:06] sustrik	how can it possibly know the chunk is not referenced anymore?
[16:07] pieter_hintjens	it's osx... apple... magic... :)
[16:08] cremes	sustrik: login to my box with your account and run 'man leaks'; read the first paragraph
[16:08] sustrik	let me see...
[16:08] cremes	also, reload this gist to see the output from leaks for my program: https://gist.github.com/837241
[16:10] cremes	the receiving socket in this case is an xreq
[16:11] sustrik	ok, it scans whole memory for the pointers
[16:12] cremes	yes; if you keep reading, you'll see the note about MallocStackLogging for getting the stack where the leaked memory was allocated
[16:12] cremes	s/stack/callstack/
[16:13] sustrik	the stack i see in the output is
[16:13] sustrik	thread_start | _pthread_start | thread_routine | zmq::kqueue_t::loop() | zmq::zmq_engine_t::in_event()
[16:13] sustrik	that's different from the one you've posted before
[16:13] sustrik	is it truncated or what?
[16:14] cremes	you need to scroll the gist to the right; for some reason it doesn't wrap this output
[16:14] sustrik	ah
[16:16] sustrik	well, there are 3 possibilities:
[16:16] sustrik	1. bug in 0mq
[16:16] sustrik	2. client program not closing the message
[16:16] sustrik	3. the pointer is actually stored outside of process memory
[16:17] cremes	i don't think it's 2 because i loop over all messages and call zmq_msg_close() on each part when i'm done
[16:17] cremes	there's no way for my logic to skip that step
[16:18] cremes	though it's certainly possible my code is the cause
[16:18] sustrik	ok
[16:18] sustrik	then it's either 1 or 3
[16:18] sustrik	3 happens when pointer to memory is actually held inside of socketpair buffer
[16:18] sustrik	which resides in kernel space
[16:19] sustrik	at least i would assume it does, even on osx
[16:19] cremes	i am only seeing this on the one client program i have using an xreq socket
[16:20] cremes	everything else is using req,rep,pub,sub and i don't see the leak
[16:20] cremes	so i am wondering if this is specific to xreq
[16:20] sustrik	ack
[16:20] cremes	you can see from the 'context' hex that it contains my custom identity so i assume this is routing info
[16:21] cremes	which xreq is supposed to cut off before passing the messages to me
[16:21] sustrik	is it a head version?
[16:21] sustrik	or 2.1.0 or 2.0.x?
[16:22] cremes	hang on, let me get the git commit hash (oh, 2.1.0)
[16:22] sustrik	ok
[16:22] cremes	e94790006ea6f4c64c commit from feb 9
[16:22] cremes	i'll update to latest & greatest
[16:25] neale1	pieter_hintjens: Yeah I was just going to see if anyone had done it so (a) I know it could be done and (b) they might have a "recipe" for doing it and I could pass that wisdom on
[16:25] sustrik	cremes: the application in question: is it using req or xreq socket?
[16:25] cremes	xreq
[16:26] sustrik	ok, so you handle individual message parts by hand
[16:26] sustrik	are you closing all of them?
[16:26] sustrik	identity/delimiter/body?
[16:27] cremes	yes
[16:27] cremes	btw, i see this with the latest master too
[16:28] cremes	all message parts are passed to my read callback
[16:28] cremes	and i call: messages.each { |message| message.close }
[16:29] cremes	it's iterating over the array of message parts calling close on each (zmq_msg_close is called inside of the close() instance method)
[16:29] sustrik	what language is that?
[16:29] cremes	does xreq pass the identity and delimiter messages up the stack?
[16:29] cremes	ruby
[16:29] sustrik	cremes: yes
[16:29] cremes	i thought only xrep saw all that detail
[16:30] sustrik	same with xreq
[16:31] sustrik	are there connections being created and torn down during the test?
[16:33] cremes	yes, these connections can and do go away (socket is closed via zmq_close())
[16:33] cremes	btw, i am printing everything this callback is receiving to my log
[16:34] cremes	it does get the nul delimiter message but it does not get the identity message before it
[16:34] sustrik	right the identity is used and stripped off by the xrep socket on the other side of the connection
[16:34] cremes	that confirms what i originally thought; xreq strips that off
[16:34] sustrik	xrep
[16:35] sustrik	so the identity is passed to the xreq only on a single occasion:
[16:35] sustrik	on connection initiation
[16:35] sustrik	let me check the corresponding code...
[16:36] cremes	i thought it worked opposite; xrep gets identity from xreq during connection
[16:36] cremes	xreq doesn't actually send its identity after that; xrep prepends it locally
[16:37] cremes	so when sending from xreq, it goes nul delimiter + message parts
[16:37] cremes	xrep recvs routing info (if any intervening hops) and pushes its peer's identity on the top of that stack + delimiter + message parts
[16:38] cremes	then when it replies, it just sends that routing info stack + nul + parts
[16:38] cremes	and ultimately xreq (at original source) recvs just nul delimiter + parts because each intervening xreq stripped off one level of routing info
[16:38] sustrik	except for the first part of the route stack
[16:39] sustrik	which it has already used to route the message back
[16:39] sustrik	to the xreq
[16:40] cremes	why would the last hop need to send that routing info to the originator? that socket already "knows" their own identity!
[16:40] sustrik	that's why it strips it off
[16:40] sustrik	in short
[16:40] sustrik	xreq doesn't mess with routing info at all
[16:41] sustrik	xrep adds one peer's identity to the stack on message receipt
[16:41] sustrik	and strips one identity from the stack on send
[16:41] pieter_hintjens	s/xreq/dealer/g, s/xrep/router/g, it'll be much easier
[16:41] sustrik	i think i see the leak
[16:41] pieter_hintjens	dealers are just like push + pull
[16:42] cremes	right, then i think we agree; the last xrep to reply to the original xreq will strip off its identity, so it only sends the delimiter + parts
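
The envelope rules the two settle on can be traced with a small model. This is a sketch in Python with hypothetical function names (`xrep_on_receive`, `xrep_on_send`), assuming messages are plain lists of byte frames. It models only what is stated in the chat: xreq leaves the routing frames alone, xrep pushes one peer identity on receive and strips one identity on send.

```python
def xrep_on_receive(peer_identity, frames):
    """XREP side, incoming: push the identity of the peer the message came from."""
    return [peer_identity] + list(frames)


def xrep_on_send(frames):
    """XREP side, outgoing: strip the first frame and use it to pick the peer.

    Returns (identity, frames_forwarded_to_that_peer).
    """
    if not frames:
        raise ValueError("an outgoing XREP message needs at least an identity frame")
    return frames[0], list(frames[1:])


if __name__ == "__main__":
    # The xreq application sends an empty delimiter followed by the body;
    # the xreq socket itself adds nothing.
    request = [b"", b"ping"]

    # The xrep socket hands the application the request with the identity on top.
    at_server = xrep_on_receive(b"client-1", request)
    print(at_server)  # [b'client-1', b'', b'ping']

    # The reply reuses the same routing stack; xrep strips the identity on send.
    identity, on_wire = xrep_on_send([b"client-1", b"", b"pong"])
    print(identity, on_wire)  # b'client-1' [b'', b'pong']
```

The last line matches what cremes observed in his log: the xreq callback receives the empty delimiter and the body, but no identity frame.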
[16:44] sustrik	cremes: i've pasted the patch via irc to you directly
[16:45] sustrik	let me know whether it helps
[16:45] pieter_hintjens	sustrik: thanks for your help with those two issues btw
[16:45] sustrik	np
[16:45] pieter_hintjens	i have a third one which I think actually might be a 0MQ issue ... :-)
[16:45] pieter_hintjens	#169
[16:46] sustrik	:)
[16:46] pieter_hintjens	not sure what the semantics are for zmq_term and pubsub
[16:46] pieter_hintjens	if there are 10 connected subscribers, should they all get the last message?
[16:47] pieter_hintjens	assuming publisher sends and then terminates
[16:47] sustrik	aren't you just running into async connect issue?
[16:47] cremes	sustrik: success!
[16:47] pieter_hintjens	it's a synchronized pubsub example
[16:48] sustrik	cremes: great
[16:48] sustrik	let me apply the patch
[16:48] pieter_hintjens	subscribers explicitly tell publisher when they are present
[16:48] cremes	sustrik: if you want to do an occasional 'leaks' check on osx, feel free
[16:48] sustrik	ack
[16:49] cremes	it's really easy... # MallocStackLogging=1 ./my_program
[16:49] cremes	leaks <pid of my_program>
[16:49] pieter_hintjens	sustrik: I'll test if it's due to async connects... hang on...
[16:50] CIA-21	zeromq2: Martin Sustrik master * r0eea935 / src/zmq_init.cpp :
[16:50] CIA-21	zeromq2: Fix for memory leak caused by long identities
[16:50] CIA-21	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/hrfcjQ
[16:50] sustrik	cremes: done
[16:50] sustrik	and thanks for the offer
[16:51] sustrik	cremes: btw, it was you who said that SO_SNDBUF/SO_RCVBUF on OSX is measured in kB rather than bytes, right?
[16:51] cremes	sustrik: yeah, that's the way it looks to me but i can't find that documented anywhere
[16:51] cremes	it's screwy
[16:52] sustrik	the interesting part is that you've mentioned that getsockopt(SNDBUF) returns 0
[16:52] sustrik	so it's not obvious how to even find out
[16:53] cremes	ack
[16:53] cremes	if you want to write a small c program that exercises that stuff, that's probably the best way to know "for sure"
[16:53] sustrik	what kind of kernel is that btw
[16:53] sustrik	proprietary?
[16:53] cremes	i could also ask about it on apple's dev lists
[16:53] cremes	nope, it's open source
[16:54] cremes	it's called darwin... it's a mach + freebsd hybrid
[16:54] sustrik	then try asking on the list
[16:54] cremes	ok
[16:54] sustrik	the functionality is obviously misbehaving
[16:54] sustrik	so it would be nice to know what the devs have to say about it
[16:54] cremes	any chance you could provide a small c program that illustrates the issue?
[16:54] cremes	code always talks louder
[16:54] sustrik	i can try
[16:54] sustrik	wait a sec
[16:54] cremes	especially when they can repro it :)
[16:55] cremes	maybe get/setsockopt only screw up on socketpairs
[16:55] sustrik	yes, that's my thinking as well
[16:56] sustrik	btw, POSIX doesn't specify the unit :_
[16:56] sustrik	:)
[16:56] sustrik	it just says "buffer size"
[16:56] cremes	heh
[16:56] cremes	yeah, i can't imagine get/setsockopt are broken for all sockets... that would be very obvious
[16:56] sustrik	Same with Stevens' book
[16:57] sustrik	anyway, let me write an example
[16:57] cremes	k
[17:02] pieter_hintjens	sustrik: you were right...!
[17:02] pieter_hintjens	0MQ is just too fast, all the 1M messages get broadcast even before the clients connect...
[17:03] sustrik	heh
[17:03] pieter_hintjens	it makes it quite a challenge to synchronize subscribers and publishers then...
[17:03] pieter_hintjens	not sure the worked example is even valid
[17:06] sustrik	i am not sure that it's even possible
[17:06] sustrik	it's like synchronising a radio show
[17:06] sustrik	with all listeners
[17:07] sustrik	there may still be a letter from a listener in amazonia asking to postpone the show
[17:07] sustrik	stuck somewhere in post office at manaos
[17:08] sustrik	if you know the number of listeners in advance, it can be solvable
[17:13] sustrik	cremes: i've created a socket pair on OSX
[17:13] cremes	ok
[17:13] sustrik	tried to getsockopt the SNDBUF and RCVBUF
[17:13] sustrik	both result in 3,000,000
[17:13] sustrik	where have you seen it returning a zero?
[17:14] cremes	in mailbox.cpp ::send, that buffer expansion code will return 0 or 32 depending
[17:14] cremes	zmq::mailbox_t::send around line 160
[17:15] sustrik	strange
[17:15] cremes	line 170 usually fails to return a sane number
[17:15] sustrik	maybe setsockopt uses different units than getsockopt?
[17:15] sustrik	let me try
[17:16] cremes	btw, that 3 million number is bytes and i set it in my /etc/sysctl.conf if you want to look at that
[17:16] sustrik	that seems to be ok
[17:16] cremes	that's the sysctl for local communications which is what i guess socketpair uses
[17:16] sustrik	no need checking
[17:16] sustrik	the problem is somewhere further down the way
[17:21] sustrik	resizing works better on osx than on linux :)
[17:21] sustrik	i resize to 100,000
[17:21] sustrik	i check the size
[17:21] sustrik	i get 100000
[17:21] sustrik	when i do same with linux i get
[17:21] sustrik	200000
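A minimal sketch of the experiment described above, written in Python rather than the C program discussed, using only the standard socket module. It assumes a POSIX system where socket.socketpair() is available; the 100,000 figure is the one quoted in the chat. On Linux the kernel documents that it doubles the value passed to setsockopt for SO_SNDBUF (for bookkeeping overhead), which would account for the 200000 reading; other systems may report the value differently.

```python
import socket

REQUESTED = 100_000  # bytes requested, matching the figure quoted in the chat

a, b = socket.socketpair()  # a connected pair of local sockets
try:
    # Size reported before any change (the system default for this socket).
    before = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    # Ask for a new send buffer size, then read back what the kernel reports.
    a.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, REQUESTED)
    after = a.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    print(f"default={before} requested={REQUESTED} reported={after}")
finally:
    a.close()
    b.close()
```

Comparing the "requested" and "reported" numbers on each platform is one way to see whether the two calls agree on units.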
[17:22] cremes	heh
[17:22] cremes	try running that stress test... that usually blows it up
[17:27] pieter_hintjens	sustrik: something to think about for the future, if we can make synchronous connects
[17:27] pieter_hintjens	it'd add real value IMO
[17:27] pieter_hintjens	for certain use cases at least
[17:28] cremes	pieter_hintjens: that could be provided by an add-on/wrapper library
[17:28] pieter_hintjens	nope, not afaics
[17:28] pieter_hintjens	i'm hitting this problem in one of the examples
[17:28] cremes	why, because connection status isn't exposed?
[17:28] pieter_hintjens	i use a req/rep dialog to synchronize the two peers, then a pubsub dialog for the data
[17:29] cremes	sounds like ftp
[17:29] pieter_hintjens	but I can't get the two synchronized
[17:29] pieter_hintjens	because the pubsub connect can take any arbitrary time
[17:29] cremes	hmmm
[17:29] pieter_hintjens	so even if the req/rep dialog says 'ready', that doesn't mean the subscriber will get data
[17:29] cremes	right
[17:29] pieter_hintjens	the only sure way is that the pubsub dialog explicitly confirms the connection, if over a connected transport
[17:30] cremes	so, you bind the SUB socket, the req/rep says ready, you connect the PUB and it fails?
[17:30] pieter_hintjens	yup
[17:30] cremes	sounds like a bug
[17:30] pieter_hintjens	doesn't fail, just doesn't connect in time to get any data
[17:30] cremes	wait, which one doesn't connect in time to get data?
[17:30] pieter_hintjens	It's this one: https://github.com/zeromq/zeromq2/issues/169
[17:31] pieter_hintjens	i'd like a handshake between publisher & subscriber at connect time, which is exposed to the app
[17:32] pieter_hintjens	so subscriber sends identity, and publisher acknowledges, and app code can wait for that to complete
[17:32] pieter_hintjens	optionally
[17:32] cremes	pieter_hintjens: this is why i use devices so much
[17:32] cremes	put a forwarder in the middle and it will probably work
[17:32] cremes	the pub connects to the forwarder as do the subs
[17:32] pieter_hintjens	i'll have the same issue from forwarder to subscribers
[17:32] cremes	when your req/rep gives the all-clear, let 'er rip
[17:33] cremes	well, if zmq_connect() takes an arbitrary amount of time, that seems like a 'broken' contract to me
[17:33] pieter_hintjens	it's all asynchronous, adding more steps may introduce enough delay, but it's not certain
[17:33] cremes	right, i see
[17:34] pieter_hintjens	sustrik: would you consider a handshake for new connections?
[17:34] pieter_hintjens	i.e. C: identity frame, S: identity ack
[19:16] sustrik	cremes: still there?
[19:16] cremes	sustrik: yes
[19:16] sustrik	git doesn't seem to be installed on your box
[19:16] sustrik	how do you get the sources?
[19:16] cremes	add /opt/local/bin to your $PATH
[19:17] sustrik	ok
[19:17] sustrik	works!
[19:17] sustrik	thanks
[19:17] cremes	you're welcome
[19:19] sustrik	btw, have you seen the shutdown stress test fail with head?
[19:20] sustrik	doesn't fail for me
[19:26] cremes	sustrik: i haven't seen it fail since i boosted my buffers to 3MB (that 3 million number you saw earlier)
[19:26] cremes	if i move back to the defaults, i'm fairly certain it will fail
[19:26] sustrik	aha
[19:26] sustrik	can you do that?
[19:26] cremes	not at the moment... i'm wrapping up some work
[19:26] cremes	i can do that tomorrow
[19:27] sustrik	ok, np
[19:27] sustrik	just ping me then
[19:28] ljackson	did I read the documentation correctly that in a multithreaded app the zmq context is common to all sockets bind and connect ?
[19:28] sustrik	yes
[19:29] ljackson	k, if you have an embedded queue device with workers in threads connecting to inproc
[19:29] pieter_hintjens	ljackson: only necessarily if you're using inproc
[19:30] ljackson	can you with the same context in another thread add stuff to the queue via another inproc ?
[19:30] ljackson	pieter_hintjens, good to know.
[19:30] pieter_hintjens	ljackson: the semantic for communicating between threads is 'send a message'
[19:30] pieter_hintjens	you can send a message to the queue device from any thread, obviously
[19:30] ljackson	what I am seeing is zmq hanging trying to connect/bind must be something i messed up then
[19:31] ljackson	basically as a test, trying to take mtserver.cpp and put the client internal, all on inproc using the same context. is this even valid ?
[19:31] ljackson	obviously
[19:31] ljackson	different inproc:// addresses
[19:31] ljackson	workers for threads as example has and then using queue for clients
[19:31] pieter_hintjens	it should work
[19:32] ljackson	humm
[19:32] ljackson	ok will keep digging or write example test code if I can get it to work and ask again here
[19:32] ljackson	thx
[19:32] pieter_hintjens	if your code's short, post it to a gist so we can look at it
[19:33] ljackson	yeah might have to write an example code for my own sanity anyway if I can reproduce my issue I will post and ask again here
[19:34] ljackson	i also believe I read that the order of connection bind vs connect doesn't matter? Or is that in only certain socket types ?
[19:35] ljackson	worried that the device thread is not binding before the workers connect
[19:35] ljackson	...etc.
[19:35] pieter_hintjens	for inproc it matters
[19:35] pieter_hintjens	you must absolutely bind, then connect
[19:35] pieter_hintjens	let me send you a new version of mtrelay that shows how I do this in 2.1
[19:35] pieter_hintjens	it's somewhat changed
[19:36] ljackson	nice thx, I am using 2.1.1 from git
[19:36] pieter_hintjens	https://gist.github.com/837575
[19:36] pieter_hintjens	the trick is to bind and connect a socket pair in the parent thread, then pass the context & socket to the child thread
[19:37] pieter_hintjens	i will try to make a simple abstraction for this, it's a very common pattern
[19:37] pieter_hintjens	specifically for inproc multithreading that is never intended to be scaled out
[19:39] ljackson	humm
[19:39] ljackson	i thought you were never to share/send the socket ?
[19:42] pieter_hintjens	in 2.1 this is legal, and extremely useful for inproc/multithreading
[19:42] pieter_hintjens	not shared, just sent
[19:49] ljackson	ahh i get it so you know for sure that it was done in the correct order
[19:50] ljackson	then you send on the pointer and forget you knew about it
[19:50] ljackson	k
[19:51] ljackson	pieter_hintjens, so you need to bind a new socket for each worker thread then
[19:51] ljackson	in REQ/REP...etc.
[19:52] pieter_hintjens	yes
[19:55] amacleod	What 0MQ pattern is best for an asynchronous dialogue, where 2 participants can send messages but there doesn't need to be a 1:1 request/response correspondence?
[19:55] amacleod	Can XREQ/XREP do that?
[19:56] ljackson	pieter_hintjens, thread_args_t child; child->socket = new zmq::socket_t(context, ZMQ_XREP); ?
[19:57] cremes	amacleod: yes, those sockets are the perfect choice
[19:57] pieter_hintjens	ljackson: what's the question?
[19:57] amacleod	Okay, cool. My next step is to understand how to use identity addressing, then.
[19:57] pieter_hintjens	amacleod: have you read the Guide yet?
[19:57] ljackson	c++ vs your example
[19:57] amacleod	pieter_hintjens, parts of it.
[19:58] pieter_hintjens	ljackson: you're asking whether the C++ is correct?
[19:58] ljackson	as in the new pointer .. yeah nevermind answered my own Q
[20:05] amacleod	So, I noticed when using the Java bindings, I cannot seem to interrupt a recv operation that's trying to read from a socket whose other endpoint doesn't exist.
[20:05] amacleod	If I want to preserve Java interruptibility, should I be using NOBLOCK and a poller?
[20:16] fbarriga	I have a little problem with python
[20:16] fbarriga	I can't receive a structure from C++
[20:16] amacleod	fbarriga, what format are you using to serialize structures?
[20:17] fbarriga	I don't know why it tries to receive it like a string
[20:17] fbarriga	in C++ raw structure
[20:17] fbarriga	in python struct.unpack
[20:17] fbarriga	in the reverse way it works
[20:17] fbarriga	msg = self.socket.recv()
[20:18] fbarriga	len (msg) it prints 29
[20:18] fbarriga	and I'm sending only a double
[20:19] fbarriga	double nav = 1; memcpy(zmq_msg_data(&msg), &nav, sizeof(double));
[20:19] amacleod	Can you do a byte-by-byte comparison of what you're getting on the wire and the result of struct.pack with the value you expect?
[20:21] amacleod	On the C++ side you are telling zmq that your packet should be (sizeof double) bytes long?
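The byte-level check amacleod suggests can be sketched with the standard struct module alone. This assumes the C++ side copies a raw IEEE 754 double in native byte order, as the memcpy above does; decode_double is a hypothetical helper name, not part of pyzmq.

```python
import struct

DOUBLE = "=d"  # native byte order, standard 8-byte double

# What the sender is expected to put in the frame for `double nav = 1;`.
wire = struct.pack(DOUBLE, 1.0)


def decode_double(frame):
    """Unpack one double, refusing frames of the wrong length."""
    expected = struct.calcsize(DOUBLE)
    if len(frame) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(frame)}")
    return struct.unpack(DOUBLE, frame)[0]


nav = decode_double(wire)
```

A frame of 29 bytes would be rejected by the length check, which is a quick hint that the message is not the one the sender thought it was sending.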
[20:22] ljackson	pieterh, according to what I read in the docs/guide am i correct in that the PUSH/PULL socket can have multiple pushers and a single puller in the same context e.g. workers and sinks ?
[20:23] pieterh	sure
[20:23] fbarriga	sorry guys I found it
[20:23] ljackson	pieterh, just wanted to make sure I read it right thx :)
[20:23] pieterh	ljackson: all socket types except PAIR can be connected 1-N or N-1
[20:23] ljackson	or N-N ?
[20:23] fbarriga	quite stupid my error, I've 2 streams and I was connecting to the wrong one :(
[20:24] pieterh	that's just 1-N and N-1 from two sides
[20:24] ljackson	ya
[20:25] ljackson	inproc, pull push need the same treatment as req/rep ..etc ?
[20:28] pieterh	ljackson: rtfm here http://zguide.zeromq.org/chapter:all#toc21
[20:28] pieterh	i'm adding more detail on the 2.1 mtrelay example but the point is that inproc is not a disconnected transport
[20:34] ljackson	k thx
[20:52] amacleod	In Java bindings, how do I determine what error happened when recv returns null?
[20:58] CIA-21	jzmq: 03Gonzalo Diethelm 07master * r91da678 10/ (5 files in 2 dirs):
[20:58] CIA-21	jzmq: Use zmq_errno() everywhere instead of errno.
[20:58] CIA-21	jzmq: Set all projects to compile with a Release configuration. - http://bit.ly/h8ZIKL
[21:05] gdan	trying to build in ubuntu 10.10: installed: g++,g++ 4.5,gcc-opt, libstdc++6, libstdc++645-dev, *** I am getting what appear to me to be stl errors, does anyone have a list of required libraries?
[21:06] gdan	building the libzmq
[21:09] pieterh	gdan: I'm searching for the answer...
[21:09] gdan	thanks
[21:10] pieterh	gdan: do you have build-essential?
[21:11] gdan	let me check
[21:12] gdan	do not see it, nor do i see it available in software center
[21:13] pieterh	apt-get install build-essential
[21:13] pieterh	sudo apt-get install build-essential, to be accurate
[21:13] pieterh	also uuid-dev
[21:14] gdan	installing...
[21:14] soren	Or: sudo apt-get build-dep libzmq0
[21:14] soren	That installs all the build-dependencies of the libzmq0 ubuntu package.
[21:14] soren	Handy shortcut.
[21:15] gdan	will do...
[21:19] gdan	i re-ran configure, now make is running. so far, looks good. thank you
[21:24] pieterh	gdan: np
[21:28] ljackson	pieterh, odd appears i am close, getting context terminated exception...
[21:29] pieterh	main thread terminates the context before other threads are finished
[21:31] ljackson	humm
[21:32] ljackson	this happens starting up the ZMQ_QUEUE device. even before I get to starting worker threads
[21:32] ljackson	QUEUE bridging inproc not tested ?
[21:38] pieterh	ljackson: hard to say without code to look at
[21:39] pieterh	the queue device is just normal code
[21:39] pieterh	find where you're doing zmq_term, print "HELLO" and then check whether that shows before the error...
[21:40] pieterh	if you terminate the context, that kills all inproc sockets
[21:42] ljackson	yeah not doing any terminate, but a few other things with this existing code... prob need to make test code to try to reproduce
[21:49] ljackson	doh I think i see it
[21:51] amacleod	Where are the URIs for 0MQ documented? I assumed I should be using tcp://host:port even for my XREQ/XREP dialogue, but the "mamas/papas" examples in the guide use things like "ipc://routing.ipc".
[21:53] amacleod	Found it. zmq_ipc(7) man page.
[21:58] pieterh	amacleod: zmq_bind / zmq_connect man pages list the transports, each has its own man page
[21:59] amacleod	Ok, thanks. So I still want tcp since I want all my sockets to be remote.
[22:00] amacleod	I'm still having trouble wrapping my head around the addressing requirements for XREQ/XREP.
[22:01] amacleod	So far I have an XREP "server" that listens for connections. I'm able to create an XREQ "client" that establishes a connection and sends a request.
[22:01] amacleod	The server sees the request and where it came from (2 message parts: address, payload), and is able to reply.
[22:02] amacleod	When I send the reply as (address, null, payload), I can see in wireshark that some data gets sent in the opposite direction on the socket, but my recv call in the client never returns.
[22:02] pieterh	amacleod: ... some generalities about the request-reply pattern
[22:02] pieterh	in most cases, client is REQ and server is REP
[22:03] pieterh	and anything in the middle (e.g. queue device) is XREP--XREQ
[22:03] pieterh	that's the basic layout
[22:03] pieterh	after that there is the weird stuff
[22:03] pieterh	which chapter 3 explains
[22:03] amacleod	So to do this "asynchronous dialogue" thing, do I need to have more than 2 participants?
[22:04] pieterh	asynchronous means what exactly?
[22:04] pieterh	in your use case, I mean
[22:04] amacleod	Not lockstep on requests and replies.
[22:04] pieterh	so how do the messages flow?
[22:04] pieterh	do you have N clients, 1 server?
[22:04] pieterh	N clients, N servers?
[22:04] amacleod	Yes.. N clients, 1 server.
[22:04] amacleod	Bi-directional messages.
[22:05] pieterh	can servers send messages to clients who have not sent a message to the server?
[22:05] pieterh	or is it 1 request / 0-N responses?
[22:05] amacleod	No. There has to be at least an initial handshake.
[22:05] amacleod	It's 1-N requests, 0-N responses.
[22:05] amacleod	So, the client can still send messages to the server even while the server is sending responses.
[22:05] pieterh	can clients pipeline requests?
[22:05] pieterh	ah, ok
[22:06] pieterh	so your server definitely has to be XREP (which we will rename to ROUTER at some stage)
[22:06] amacleod	Okay. That's the way I have it.
[22:06] pieterh	your clients always initiate the dialog, and they talk to a single server only
[22:07] amacleod	When I get messages in on the server, I add the identity to a routing table in the server object.
[22:07] gdan	does anyone know how, in mono (on ubuntu), i reference the libzmq runtime libraries i just built in the project? (i do see that they are in /usr/local/lib)
[22:07] pieterh	so they do indeed use XREQ
[22:07] amacleod	pieterh, that is correct.
[22:07] pieterh	if your clients talked to N servers, you'd want to use XREP for them as well
[22:07] amacleod	gdan, does mono respect LD_LIBRARY_PATH?
[22:08] gdan	don't know. perhaps when i run i specify the path?
[22:08] amacleod	pieterh, indeed. One thing I didn't realize at first was that, with XREQ, I did not need to explicitly send the client's identity.
[22:08] amacleod	gdan, I know in Java, I have to do something like -Djava.library.path=path/to/jzmq_native. Probably mono has something similar.
[22:08] pieterh	amacleod: if you use either XREP or XREQ in your code, you absolutely have to study and learn how these sockets create and use envelopes
[22:09] pieterh	it's logical once you know it but it is really not obvious
[22:09] amacleod	pieterh, I have been trying to understand a bit by looking at wireshark.
[22:09] gdan	i'll look
[22:09] pieterh	it's explained in Ch3, look for Request-Reply envelopes
[22:10] amacleod	One thing I was confused about in chapter 3 was how XREP and XREQ expect their envelopes. Is it true that XREQ expects envelopes the same way REQ does and that XREP expects envelopes the same way REP does?
[22:10] pieterh	no
[22:10] pieterh	the name XREQ is highly misleading
[22:10] pieterh	it should and will be called "DEALER"
[22:10] pieterh	it is exactly equivalent to a PUSH+PULL combination
[22:11] pieterh	it deals messages out, and in, without changing them
[22:11] pieterh	whereas REP is a terminator that rips open the envelope, hides it, gives you the contents, and sneakily recreates the envelope when you send a reply
[22:11] pieterh	they are fundamentally different tools
[22:12] pieterh	take a look at the rtdealer example
[22:12] amacleod	Ok. What I was thinking a while ago is that I could achieve what I wanted with a pair of REQ/REP sockets, one for each direction. The replies would always be dinky little "yep I got it" acknowledgements.
[22:13] pieterh	it will work really easily using XREQ/XREP, don't panic
[22:13] amacleod	But I couldn't think of a good way to create both at once. (trying not to panic :-D)
[22:13] pieterh	- your clients using XREQ just connect, and send simple messages
[22:14] pieterh	- the server, when it receives a message, gives the app the client identity, followed by the simple message, in two parts
[22:14] amacleod	So they are similar to the "worker" in rtdealer? (I say this because worker uses XREQ).
[22:14] pieterh	- the server, when it wants to talk to client A, sends two parts: identity A, and then simple message
[22:14] pieterh	in rtdealer, client is the worker (uses XREQ as you see), server is the main thread
[22:15] pieterh	would it help to have a real example of a server and clients working like this?
[22:15] pieterh	i'll make one tomorrow, it's late here now
[22:15] amacleod	Possibly. I might try translating rtdealer into Java after I get home.
[22:16] pieterh	it's not literally what you want, but it shows how to use that socket pair
[22:17] amacleod	I thought that I had things pretty much okay as far as you have said. When you say "the server sends two parts: identity A, and then simple message", do you mean that XREP(router) does that for me when I call send, or that my code should do that using sendmore followed by send?
[22:17] pieterh	sendmore followed by send
[22:18] pieterh	you need to explicitly tell the router socket who to send the message to
[22:18] pieterh	it does not actually send the identity
[22:18] pieterh	it uses that to decide what client to talk to, then sends the remaining message part(s)
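The three points pieterh lists, plus the note that the identity frame is used for routing and not transmitted, can be sketched as a toy model. This is an illustration under stated assumptions only: ToyRouter, its methods, and the per-client inbox lists are hypothetical stand-ins for the connections a router (XREP) socket holds, not the 0MQ API.

```python
# Toy model only: a dict of per-client inboxes stands in for the connections
# held by a ROUTER (XREP) socket. None of this is the 0MQ API.


class ToyRouter:
    def __init__(self):
        self.inboxes = {}  # identity -> list of messages delivered to that client

    def connect(self, identity):
        """Register a client (stands in for a DEALER/XREQ connecting)."""
        self.inboxes[identity] = []

    def receive_from(self, identity, payload):
        """What the server app sees: identity first, then the client's message."""
        return [identity, payload]

    def send(self, frames):
        """The first frame selects the client; only the rest is delivered."""
        identity, rest = frames[0], frames[1:]
        self.inboxes[identity].append(rest)


router = ToyRouter()
router.connect(b"A")
router.connect(b"B")

seen = router.receive_from(b"A", b"hi")
# The server may reply any number of times, with no request/reply lockstep.
router.send([b"A", b"first reply"])
router.send([b"A", b"second reply"])
```

In this model a reply sent as [identity, empty frame, payload] would reach the client as [empty frame, payload], so a client written to expect a single part would see the empty frame first.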
[22:18] amacleod	Ok. It's possible there is a bug in my client receiving code. I did see the reply come back on the socket in wireshark.
[22:19] pieterh	I think your use case is much better than the one I used for XREP-XREQ
[22:20] pieterh	so I'll change the Guide for that...
[22:20] amacleod	It's for instant-message style chat. (wrapping XMPP).
[22:20] pieterh	indeed
[22:20] pieterh	tomorrow sometime I'll have a working example
[22:21] amacleod	Ok. I'll stop pestering you so you can sleep :-) If I am able to translate the current example into Java, I'll let you know here tomorrow.
[22:23] pieterh	np :-)
[22:41] CIA-21	zeromq2: 03Mikko Koppanen 07master * r98ccff1 10/ builds/redhat/zeromq.spec :
[22:41] CIA-21	zeromq2: Fixes build on at least CentOS 5
[22:41] CIA-21	zeromq2: Signed-off-by: Mikko Koppanen <mikko.koppanen@gmail.com> - http://bit.ly/hHoTEo
[23:39] rem7	what is the best way to ensure the delivery of a msg in a PUSH/PULL (something very similar to the ventilator tutorial) ... I sent out about 16Million msgs and my sink recvd 10million.
[23:41] rem7	I was reading that multicast only works on PUB/SUB...?
[23:50] mikko	sustrik: thanks!