ZeroMq IRC Log

Monday October 11, 2010

[Time] Name	Message
[06:50] sustrik	petrilli: can you spell your problems with java binding more explicitly
[06:50] sustrik	?
[06:50] sustrik	having a list of issues could make it move forward faster
[08:01] mikko	good morning
[08:13] sustrik	morning
[08:24] mikko	Assertion failed: term_acks > 0 (own.cpp:175)
[08:24] mikko	this random assertion keeps popping up
[08:24] mikko	let me make sure that i got the latest master
[08:35] mikko	sustrik: at the moment on master: the context close will block even if the sockets are closed ?
[08:36] mikko	assuming there are messages in-flight waiting to be sent
[08:55] mikko	hmm
[09:08] sustrik	mikko: yes
[09:09] sustrik	the requirement was not to drop messages, so someone has to wait till they are sent
[09:09] mikko	sustrik: take a look at this
[09:09] mikko	sec
[09:10] mikko	https://gist.github.com/25d81c09cf6a838a2aed
[09:10] mikko	seems to result into deadlock
[09:10] mikko	zmq::ctx_t::terminate (this=0x601010) at semaphore.hpp:117
[09:10] sustrik	you haven't closed the sockets
[09:10] mikko	let me close
[09:11] sustrik	thus the context has no idea whether there are more messages going to be sent or what
[09:11] mikko	because i keep getting a deadlock in php
[09:12] mikko	which i cant reproduce in plain c
[09:12] mikko	i assume it has something to do with destruction order
[09:15] mikko	Assertion failed: !prefetched (xrep.cpp:108)
[09:15] mikko	now i got this out
[09:18] mikko	also Assertion failed: inpipe_ && outpipe_ (xreq.cpp:42)
[09:18] mikko	i think i must be doing something wrong
[09:21] sustrik	mikko: that's your test program>
[09:21] sustrik	?
[09:21] sustrik	in C?
[09:21] mikko	sustrik: i can see how these happen
[09:22] mikko	yes
[09:22] mikko	C
[09:22] sustrik	can you paste it, so that i can try?
[09:23] mikko	first one: https://gist.github.com/b7b74bf1521c085aa51f
[09:23] mikko	i think i must have error there
[09:23] mikko	as it ends up blocking on recv
[09:24] sustrik	what about the assertions?
[09:24] sustrik	what version are you using?
[09:24] mikko	comment out lines 45 - 49
[09:24] sustrik	xrep.cpp:108 has no assert in HEAD
[09:24] mikko	and you will get Assertion failed: !prefetched (xrep.cpp:108)
[09:24] mikko	let me see which version i got
[09:25] mikko	i thought i got latest master but i'll recheck
[09:25] mikko	taking a fresh checkout just in case
[09:30] sustrik	when i remove the lines 45-49
[09:30] sustrik	program exits with no problem
[09:30] sustrik	when i keep them in it freezes
[09:30] mikko	it blocks on recv() ?
[09:30] mikko	is that expected or do i have some silly error there?
[09:31] mikko	sustrik: http://github.com/zeromq/zeromq2/blob/master/src/xrep.cpp#L108
[09:32] mikko	?
[09:33] sustrik	hm, you are right
[09:33] sustrik	i wonder why it's not on my box
[09:33] mikko	so commenting out lines 45-49 causes Assertion failed: !prefetched (xrep.cpp:108)
[09:35] sustrik	ack, i'll remove the assert
[09:36] sustrik	it was a patch I've applied without thinking about it sufficiently :|
[09:36] mikko	https://gist.github.com/9cc7dbeaa1b37ff44626
[09:37] mikko	that causes
[09:37] mikko	Assertion failed: inpipe_ && outpipe_ (xreq.cpp:42)
[09:37] sustrik	as for the freeze, it's hung up in zmq_recv
[09:37] mikko	the freeze is unexpected?
[09:38] sustrik	nope
[09:38] sustrik	when using XREP
[09:38] sustrik	you have to send the identity first
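A toy model of why the reply never shows up, assuming (as sustrik says) that an XREP socket treats the first message part as the identity of the peer to route to. ToyXrep and its methods are hypothetical names for illustration, not libzmq API, and the silent drop for an unknown identity is an assumption of this sketch.

```python
from collections import deque


class ToyXrep:
    """Toy stand-in for an XREP socket; not the libzmq API."""

    def __init__(self):
        # identity -> queue of messages waiting for that peer
        self.peers = {}

    def attach(self, identity):
        """Register a connected peer and return its inbound queue."""
        self.peers[identity] = deque()
        return self.peers[identity]

    def send(self, parts):
        """Route on the first part; drop the message if no peer matches."""
        identity, body = parts[0], parts[1:]
        queue = self.peers.get(identity)
        if queue is None:
            return False  # assumed: unroutable messages vanish silently
        queue.append(body)
        return True


xrep = ToyXrep()
inbox = xrep.attach(b"peer-1")

# No identity part: b"hello" is taken as the identity, nothing matches,
# so the peer never gets anything and a blocking recv on its side would hang.
dropped = xrep.send([b"hello"])

# Identity first: the body reaches the peer's queue.
routed = xrep.send([b"peer-1", b"hello"])
```

In this model the sender sees no error in the first case, which matches the symptom in the log: nothing fails, the other side just waits forever.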
[09:38] mikko	will zmq_poll show it readable?
[09:39] sustrik	when exactly?
[09:39] sustrik	btw, changing socket types to REQ/REP works OK
[09:40] mikko	it's blocking on zmq_recv, i wonder if polling socket before the recv show it as readable
[09:45] sustrik	it should not
[09:45] mikko	i can test
[09:51] mikko	zmq_poll returns it not readable
[09:51] mikko	good
[09:52] sustrik	ack
[09:52] mikko	will zmq_poll show socket non-writable if HWM has been reached?
[09:52] mikko	the inpipe/outpipe assert might be because of incorrect usage of XRE(P|Q) sockets
[09:53] sustrik	mikko: yes
[09:53] sustrik	it will show !writeable
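A minimal sketch of the behaviour sustrik confirms here: once the high-water mark is reached the socket stops being writable until something drains. ToyPipe, writable() and drain() are hypothetical names, and counting the HWM in messages is an assumption of this model rather than a statement about libzmq internals.

```python
from collections import deque


class ToyPipe:
    """Bounded outbound queue; hwm is the maximum number of queued messages."""

    def __init__(self, hwm):
        self.hwm = hwm
        self.queue = deque()

    def writable(self):
        """What a poll for writability would report in this toy model."""
        return len(self.queue) < self.hwm

    def send(self, msg):
        """Queue msg, or refuse (like a non-blocking send) at the HWM."""
        if not self.writable():
            return False
        self.queue.append(msg)
        return True

    def drain(self, count=1):
        """Pretend the network took up to `count` messages."""
        for _ in range(min(count, len(self.queue))):
            self.queue.popleft()


pipe = ToyPipe(hwm=2)
accepted = [pipe.send(b"m%d" % i) for i in range(3)]  # third send hits the HWM
blocked = not pipe.writable()
pipe.drain()
writable_again = pipe.writable()
```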
[09:54] sustrik	as for the assert, it should not happen even if the sockets are used in incorrect way
[09:54] sustrik	i'll check
[09:57] mikko	https://gist.github.com/e7779cdc9345967cc75e this is also supposed to block on zmq_term?
[09:57] mikko	i assume because i connect the PUB socket
[10:25] CIA-14	zeromq2: Martin Sustrik master * rf22e85f / src/xrep.cpp :
[10:25] CIA-14	zeromq2: Reverting commit 1d431190f50c86f62460
[10:25] CIA-14	zeromq2: The patch was supposed to check that pipe writer sends messages
[10:25] CIA-14	zeromq2: in atomic fashion. However, it prevented the user to read
[10:25] CIA-14	zeromq2: half of a message and close the socket.
[10:25] CIA-14	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/c70jEY
[10:25] sustrik	mikko: the assert is removed from master
[10:25] mikko	good!
[10:26] sustrik	what next?
[10:26] mikko	it's odd that PUB socket close semantics are different depending on whether you bind or connect
[10:26] mikko	that might be confusing for new users
[10:26] sustrik	it's that way for all sockets
[10:26] sustrik	when you connect, a queue is created
[10:26] sustrik	the messages are stored in it
[10:27] sustrik	when you bind, there's no queue
[10:27] sustrik	as you don't even know how many peers there are going to be
[10:27] sustrik	a queue for a peer is created when the peer connects
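The bind/connect difference described above can be sketched as a toy model. ToySocket and all of its methods are hypothetical; the sketch assumes PUB-like fan-out and only illustrates when a queue exists to hold messages, which is what decides whether terminating the context has anything to wait for.

```python
from collections import deque


class ToySocket:
    """Toy model of per-peer send queues; names are illustrative only."""

    def __init__(self):
        self.queues = {}  # peer name -> queued messages

    def connect(self, endpoint):
        # Connecting creates the queue at once, even if nobody is listening.
        self.queues[endpoint] = deque()

    def bind(self, endpoint):
        # Binding creates nothing: the number of future peers is unknown.
        pass

    def peer_arrived(self, peer):
        # On a bound socket, a queue appears only when a peer connects.
        self.queues[peer] = deque()

    def send(self, msg):
        # Copy the message into every existing queue; with no queue it is lost.
        for queue in self.queues.values():
            queue.append(msg)
        return len(self.queues)

    def pending(self):
        """Messages that a terminating context would still have to flush."""
        return sum(len(q) for q in self.queues.values())


connected = ToySocket()
connected.connect("tcp://peer:5555")   # the peer may well be down
connected.send(b"x")

bound = ToySocket()
bound.bind("tcp://*:5555")
bound.send(b"x")                       # no peers yet, nothing is stored
```

Here connected.pending() is 1 while bound.pending() is 0, which is the asymmetry mikko finds confusing: only the connected socket leaves something behind for the context to block on.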
[10:28] mikko	tricky situation, i think the current semantic for close is a bit problematic but apart from timeout i can't really think anything better either
[10:29] sustrik	yes, samw here
[10:29] sustrik	same*
[10:29] mikko	it's too easy to shoot yourself in the leg at the moment
[10:29] sustrik	you mean by blocking in term, right?
[10:29] mikko	for example if your remote peer goes down it might cause things to block eternally. in case of something like php scripts that would bring the whole site down
[10:30] sustrik	ack
[10:30] sustrik	we need to add SO_LINGER option
[10:30] sustrik	btw, reproduced the xreq.cpp:42 problem
[10:37] mikko	good!
[10:38] mikko	sustrik: even SO_LINGER is slightly undeterministic
[10:39] mikko	as the caller can't know whether it blocks due to "not being able to send" or whether it's sending but hasn't flushed everything yet
[10:39] mikko	what about making zmq_term non-blocking and returning error code if there are messages in-flight?
[10:39] mikko	that way the user can handle the different scenarios as needed
[10:40] mikko	or zmq_term(ctx, 0) for blocking zmq_term(ctx, ZMQ_NOBLOCK);
[10:40] mikko	latter would come back with EAGAIN if it's still flushing stuff
[10:41] mikko	that is an API breakage but isn't api breaks possible in 2.1 ?
[10:43] mikko	it would enable to do things such as: http://gist.github.com/620337
[10:45] mikko	the blocking version could also use so_linger to determine timeout
[10:45] mikko	that way the core library doesn't need to try to give 'one size fits all' solution but to delegate it to the user
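A sketch of how a caller might use the non-blocking terminate mikko proposes. Nothing here is real libzmq API: ToyContext, term(noblock=...), discard() and shutdown() are all hypothetical, and the drain_per_call parameter only simulates messages leaving the process between attempts.

```python
import errno
import time


class ToyContext:
    """Stand-in for a context with the *proposed* non-blocking terminate."""

    def __init__(self, pending, drain_per_call=0):
        self.pending = pending                # messages still queued
        self.drain_per_call = drain_per_call  # simulated progress per attempt

    def term(self, noblock=True):
        """Return 0 when drained, else errno.EAGAIN (proposed semantics).

        The noblock flag is kept only to mirror the proposed call shape.
        """
        self.pending = max(0, self.pending - self.drain_per_call)
        return 0 if self.pending == 0 else errno.EAGAIN

    def discard(self):
        """Hypothetical escape hatch: throw away whatever is left."""
        dropped, self.pending = self.pending, 0
        return dropped


def shutdown(ctx, attempts, pause=0.0):
    """Retry term a bounded number of times, then apply the caller's policy."""
    for _ in range(attempts):
        if ctx.term(noblock=True) == 0:
            return "flushed"
        time.sleep(pause)
    ctx.discard()  # this application chooses to drop; another might keep waiting
    return "discarded"


healthy = shutdown(ToyContext(pending=100, drain_per_call=40), attempts=5)
stuck = shutdown(ToyContext(pending=100, drain_per_call=0), attempts=5)
```

The point of the proposal is visible in shutdown(): the policy (how long to retry, whether to drop or persist) lives in the application rather than in the library.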
[10:46] sustrik	what's the difference between "not being able to send" and "haven't flushed everything yet"?
[10:48] mikko	not being able to send is for example if there are no lower level sockets open (not sure if context knows this)
[10:48] mikko	and the latter is when the messages are flying out to the network stack
[10:50] sustrik	by the former you mean that there wasn't zmq_bind or zmq_connect called on the socket?
[10:53] mikko	yes, that as well
[10:54] mikko	i don't know whether the context knows things about zmq_connect getting back connection refused
[10:54] mikko	and there is no active connection
[10:55] mikko	the main problem in close are 'connect'ed sockets
[10:55] mikko	i assume
[10:56] mikko	for example: 1. create pub socket 2. call zmq_connect 3. send() (under the hood socket gets connection refused) 4. close the socket 5. close the context
[10:57] mikko	in this scenario the remote peer is not there so you cannot send
[10:57] mikko	not sure if that is too much state
[10:58] sustrik	how does that differ from the case when server went down while sending the message?
[11:00] sustrik	anyway, if you want to define consistent semantics for the shutdown, you have to forget about underlying transport
[11:00] sustrik	details of how TCP works are irrelevant
[11:01] mikko	but that information is relevant to me as a user
[11:01] sustrik	why so?
[11:01] mikko	if i call close and there are 100 messages in-flight
[11:01] mikko	if the same 100 messages are there after 10 seconds i want to be able to act on it
[11:02] guido_g	because the app-developer knows how to handle the situation
[11:02] guido_g	'morning btw
[11:02] mikko	exactly, because my close semantics might depend on the data that the specific socket has been handling
[11:03] mikko	in some cases i might want to block until they are sent, even if it took days
[11:03] sustrik	so what you want is reliable delivery
[11:03] guido_g	no
[11:03] guido_g	more information on what is going on
[11:03] sustrik	either get the message to the peer or return it to the sender
[11:03] mikko	in some cases i might want to discard them if they are not being sent
[11:03] sustrik	that's what SO_LINGER is for
[11:03] guido_g	some sort of introspection of the current state of a ømq context or socket
[11:04] sustrik	impossible in distributed environment
[11:04] sustrik	the message may be in a device somewhere
[11:04] mikko	sustrik: i don't care about that
[11:04] sustrik	the library has no idea what state it is in
[11:04] guido_g	that's bad
[11:04] mikko	sustrik: as a developer all i care is that it has left my program
[11:04] mikko	or that it's not leaving my program
[11:05] mikko	think about the following scenario: i send 100 huge messages, the remote peer is consuming them but slowly. given small so_linger the messages might be discarded even if the remote peer is actually consuming
[11:06] mikko	that situation is different from a situation where the messages are in memory and are not being consumed at all
[11:06] mikko	i'm not saying that so_linger is not useful. it is for some scenarios but it's still a bit non-deterministic
[11:07] mikko	if i've closed my sockets, i'm not sending anything and the messages are not leaving my program i would like to know about that
[11:07] mikko	i dont need to care whether the remote peer is actually down or network is down. i just want to know they are not being sent and act on it
[11:07] sustrik	i think the problem in your reasoning is that you assume we know whether messages are being consumer or not
[11:07] mikko	depending on data i might choose to discard it or store locally
[11:07] sustrik	what does it exactly mean?
[11:08] sustrik	consumed*
[11:09] mikko	apart from inproc, to me it means that the message has left the current program
[11:10] sustrik	we can drop them then, no?
[11:10] mikko	as a developer i would like to choose
[11:10] mikko	keep blocking or discard
[11:14] sustrik	i still don't follow, how would you do the decision, based on what?
[11:15] mikko	i would do the decision based on the data
[11:15] mikko	(not sure if that answers the question)
[11:15] sustrik	what data?
[11:16] mikko	let me try to write down the scenarios i got in my head
[11:16] mikko	just a sec
[11:16] sustrik	you mean based on number of messages in 0mq's send buffer?
[11:17] mikko	the data that my application was handling and based on whether the send buffer is getting smaller on a period of time
[11:17] sustrik	ah, you want to shutdown depending on the throughput
[11:18] sustrik	if throughput goes below certain threshold => shutdown
[11:18] mikko	that was my original suggestion
[11:18] mikko	ages ago
[11:19] sustrik	yeah, that's semantically consistent solution
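The throughput rule ("below a certain threshold => shutdown") could look roughly like this. It assumes the application can somehow sample the send-queue depth, which the library did not expose at the time of this discussion; should_give_up and its parameters are hypothetical.

```python
def should_give_up(depth_samples, interval_s, min_msgs_per_s):
    """Decide from successive send-queue depths whether draining has stalled.

    depth_samples: queue depths taken every interval_s seconds, oldest first.
    Returns True when the average drain rate is below min_msgs_per_s.
    """
    if len(depth_samples) < 2:
        return False  # not enough data to estimate a rate
    if depth_samples[-1] == 0:
        return False  # fully drained, nothing to give up on
    elapsed = interval_s * (len(depth_samples) - 1)
    drained = depth_samples[0] - depth_samples[-1]
    return drained / elapsed < min_msgs_per_s


# Slow but live consumer: 100 -> 70 in 3 s is 10 msg/s, so keep waiting.
slow_peer = should_give_up([100, 90, 80, 70], interval_s=1.0, min_msgs_per_s=5)

# Dead peer: nothing leaves in 3 s, so shut down (or store the data locally).
dead_peer = should_give_up([100, 100, 100, 100], interval_s=1.0, min_msgs_per_s=5)
```

This separates the two cases mikko worries about with a fixed linger timeout: a slow consumer is left alone, a stalled one is detected.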
[11:19] mikko	because as an application developer i might want to do different decisions based on the data available to me: how many messages in flight? are the messages leaving my program? what kind data i was sending, can i just discard it or do i need to do more?
[11:22] sustrik	"how many messages in flight?"
[11:22] sustrik	that's messages in 0mq transmit buffer?
[11:22] mikko	yes
[11:23] mikko	i hope you see my point through this babbling
[11:23] mikko	:)
[11:23] sustrik	what about messages in TCP tx buffer?
[11:24] mikko	how large buffers are we talking about?
[11:25] sustrik	TCP tx buffer?
[11:25] sustrik	depends
[11:25] sustrik	128kB
[11:25] sustrik	1MB
[11:25] sustrik	shrug
[11:25] guido_g	on one side we're not allowed to see tcp through ømq and on the other side we're asked what we need to know about its state, confusing
[11:26] sustrik	exactly
[11:26] sustrik	you should not see it at all
[11:27] guido_g	what i'd like to see in the future is more thought on how to get these parameters of operation out of ømq
[11:27] guido_g	for things like monitoring
[11:27] sustrik	ack
[11:28] sustrik	there are 2 levels to the monitoring imo
[11:28] sustrik	1. network monitoring
[11:28] guido_g	i -- in the role of an ops guy -- want to know how many connections from which host are done, if there are failures and how much per etc.
[11:28] sustrik	done on IP level
[11:29] sustrik	2. device monitoring -- connecting to 0mq device and finding out how many messages are queued there and so on
[11:29] sustrik	what's a failure?
[11:29] guido_g	also i want to correlate that with the applications state and behaviour
[11:29] guido_g	a failure is this kind of situation that ops defines as a failure
[11:29] guido_g	nothing more or less
[11:30] sustrik	can you give an example?
[11:30] guido_g	in my eyes ømq as a library should provide a way to peek into its workings
[11:30] guido_g	monitoring != alerting
[11:30] guido_g	the monitoring is just collecting the data -- for starters
[11:31] guido_g	if i can't get key data like average queue sizes i'm basically lost
[11:32] guido_g	i know that this data isn't accurate, but it hasn't to be
[11:32] guido_g	most data is aggregated anyway
[11:32] sustrik	the problem is there's no real definition for "messages in flight"
[11:33] sustrik	if what you are worried about is memory consumption
[11:33] sustrik	you should monitor the memory used by your app
[11:33] guido_g	then stick a different label on the data and be done
[11:34] guido_g	sure, memory, cpu, ctx switches all known
[11:35] guido_g	except for the fact that (seen from app level) i can't say: for timespan ts there were 1000 messages sent from node a, but only 40 received by node b
[11:35] guido_g	which amazingly correlates with the memory consumption on node a
[11:35] guido_g	and the reconnect rate of the corresponding sockets
[11:35] sustrik	wait a sec
[11:36] guido_g	sure
[11:36] sustrik	why can't you say how many messages you've sent and how many you've received?
[11:36] guido_g	this one i can do
[11:37] guido_g	but it gets a little complicated if ømq routing kicks in
[11:37] guido_g	and queueing
[11:37] guido_g	then i'm completely blind
[11:37] guido_g	obviously a fact i don't anticipate
[11:37] sustrik	the queueing is just a buffer, same as tcp tx buffer
[11:37] sustrik	set the HWM
[11:38] sustrik	and you have an upper limit on the buffer
[11:39] guido_g	why is it so complicated to understand that this data is kind of important?
[11:40] sustrik	because it has no clear semantics
[11:40] sustrik	if you can't say what the figure means, you don't need it
[11:40] guido_g	huh?
[11:41] sustrik	all i want is a clear definition of the figure you want 0MQ to provide
[11:42] sustrik	one that won't change arbitrarily depending on where the data is accidentally stored
[11:42] guido_g	why should I define "semantics" of data that is already there? shouldn't this be done beforehand?
[11:42] sustrik	whether it's in 0mq buff, tcp buff, NICs buff etc.
[11:42] guido_g	we're talking about ømq
[11:42] sustrik	let me give you an example
[11:42] guido_g	so the topic is set, no ip, tcp or moonphase
[11:43] sustrik	say you connect
[11:43] sustrik	then you send a message
[11:43] sustrik	the peer goes offline in the meantime
[11:43] sustrik	what's the number of "messages in flight"?
[11:44] guido_g	not in flight
[11:45] guido_g	there is a number of messages in the queue
[11:45] sustrik	ok, so what's the number of messages in queue
[11:45] guido_g	this would be one of the numbers people might be interested in
[11:45] sustrik	?
[11:45] guido_g	how much messages are in the send queue or queues
[11:46] sustrik	1?
[11:46] sustrik	the problem is it depends on details of how TCP works
[11:46] sustrik	and timing
[11:46] guido_g	NO
[11:47] guido_g	it depends on how many send calls have put something into the queues, no?
[11:47] sustrik	no
[11:47] guido_g	tcp is no ømq
[11:47] sustrik	what happens is that 0mq is either able to push the message to TCP buffer
[11:48] guido_g	then it's removed from the send q, right?
[11:48] sustrik	before TCP realises the other endpoint is not available
[11:48] sustrik	or the order of events is reverse
[11:48] sustrik	i.e. TCP realises the peer is not available first
[11:48] guido_g	see, you're thinking way too deep here
[11:49] sustrik	then the message stays in 0mq buffer
[11:49] sustrik	so the figure is either 0 or 1
[11:49] sustrik	depending on tcp details
[11:49] guido_g	it's just about getting some numbers that might help to spot or trace problems and performance
[11:49] sustrik	exactly
[11:50] sustrik	so let's define them in a consistent way
[11:50] sustrik	rather than depending on details of underlying network transport
[11:50] sustrik	that way you are generic, consistent and future-proof
[11:50] guido_g	as i said, number of messages in a queue is a very nice and probably useful number
[11:51] sustrik	it's a definition based on implementation details
[11:51] sustrik	real definition should be based on observable behaviour
[11:52] guido_g	no
[11:52] guido_g	because you provide an "abstraction"
[11:52] sustrik	exactly
[11:52] sustrik	abstraction works only if you abstract from implementation details
[11:52] guido_g	the visible behaviour does not show what is going on
[11:53] sustrik	i mean observable behaviour such as "memory usage"
[11:53] sustrik	that's pretty clear
[11:53] guido_g	every abstraction leaks
[11:53] guido_g	the more you want to hide, the more leakage happens
[11:53] guido_g	a bad situation for both sides
[11:54] sustrik	ok, we've got into theoretical discussion :)
[11:54] guido_g	the app-devs are using "undocumented features" to get what they want and the lib-devs try to stop that
[11:54] guido_g	sustrik: not my fault
[11:54] sustrik	:)
[11:54] sustrik	it's about layering, in a correctly designed stack
[11:55] guido_g	see, monitoring is an extremely important thing, imnsho
[11:55] sustrik	if layer N doesn't provide enough flexibility, you shift down to layer N-1
[11:55] sustrik	guido_g: definitely
[11:55] sustrik	but let's do it right
[11:55] guido_g	i do need a lot of information about the current state of my apps, including the communication
[11:55] sustrik	monitoring random implementation details makes no sense
[11:56] guido_g	sure, but being picky on names of data isn't very helpful imho
[11:56] sustrik	we have to monitor real data
[11:56] sustrik	i don't care about name
[11:56] sustrik	what i'm saying is that size of 0mq queue is an implementation detail
[11:56] guido_g	but an important one
[11:56] guido_g	if i use ømq i know that
[11:57] guido_g	i mean, i know that i use ømq
[11:57] sustrik	that's because you ignore all the layers below 0mq and all the devices on your path
[11:57] guido_g	so no further abstraction is needed
[11:57] guido_g	for now and this discussion, yes
[11:57] sustrik	i still don't see what you would use the number for
[11:57] guido_g	but devices are formed with ømq so...
[11:58] sustrik	it's completely random
[11:58] sustrik	if you send 200kB of messages
[11:58] guido_g	no
[11:58] sustrik	and there's TCP tx buffer of 120 kB
[11:58] sustrik	you'll have 80kB in 0mq queue
[11:58] sustrik	if the TCP buffer is accidentally set to 200kB
[11:58] sustrik	the 0mq queue will be empty
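sustrik's numbers, worked through under the simplifying assumption that the peer is stalled and the TCP tx buffer fills completely before anything stays in the 0MQ queue. The function name is hypothetical.

```python
def zmq_queue_bytes(sent_bytes, tcp_tx_buffer_bytes):
    """Bytes left in the 0MQ queue once the TCP tx buffer has filled up.

    Assumes a stalled peer, so nothing drains from the TCP buffer.
    """
    return max(0, sent_bytes - tcp_tx_buffer_bytes)


KB = 1000  # unit choice does not affect the comparison

small_tcp_buffer = zmq_queue_bytes(200 * KB, 120 * KB)  # 80 kB stays in 0MQ
large_tcp_buffer = zmq_queue_bytes(200 * KB, 200 * KB)  # 0MQ queue is empty
```

The same 200 kB of unsent data reads as 80 kB or as 0 depending only on a kernel setting, which is the inconsistency sustrik objects to.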
[11:58] guido_g	see it as an indicator
[11:59] sustrik	exactly, it's an indicator
[11:59] guido_g	w/o the data you will lose information on what the whole system is doing
[11:59] sustrik	try to define what it is indicating
[11:59] sustrik	then try to find a consistent indicator
[11:59] guido_g	but with this indicator at hand, you might find a way to predict upcoming problems or shortcomings etc.
[11:59] guido_g	this is the whole point of monitoring
[12:00] guido_g	and if this number is already a problem, then wait for the tcp connection details one might need...
[12:03] guido_g	ok, what number can ømq provide which reflects the number of messages that the application has sent but that are not put into the lower layer for delivery?
[12:04] guido_g	i mean, there must be a point where ømq treats a message as delivered (in the sense that the lower level has taken responsibility)
[12:07] sustrik	it's on 0MQ API
[12:08] sustrik	when you call zmq_send, you transfer the responsibility
[12:10] guido_g	to ømq
[12:10] guido_g	but between the send on the app side and the send from ømq to os is "something"
[12:11] sustrik	well, yes
[12:11] sustrik	and?
[12:11] sustrik	there's some 6 layers of functionality below zmq_send call
[12:11] sustrik	most of them doing some buffering
[12:11] guido_g	and because this "something" is quite important, one needs to know if "something" is feeling well etc.
[12:12] sustrik	understand me right, i am not against monitoring
[12:12] sustrik	i just want to monitor metrics that have real meaning
[12:12] guido_g	ok
[12:12] sustrik	let's rather start from use cases
[12:13] guido_g	above i gave one
[12:14] guido_g	"messages" put into ømq via send vs. "messages" removed from ømq responsibility
[12:14] guido_g	ops
[12:14] sustrik	that's not a use case
[12:14] sustrik	that a solution
[12:14] guido_g	it is
[12:14] sustrik	use case is "what you want to do"
[12:14] guido_g	no
[12:15] sustrik	:)
[12:15] sustrik	anyway, what do you want to do?
[12:15] sustrik	i can see two options:
[12:15] guido_g	most infrastructure things are not very well described by use-cases
[12:15] sustrik	1. memory monitoring
[12:15] sustrik	2. latency monitoring
[12:15] guido_g	monitoring in itself is not a closed system that can be described statically
[12:16] sustrik	c'mon you have to know what you want :)
[12:16] guido_g	for all these points we need some numbers, right?
[12:16] sustrik	yes, we need metrics to monitor
[12:16] guido_g	i know what i want now, yes
[12:16] guido_g	but i cant know what ops will need in 3 month/years
[12:17] sustrik	solve those then
[12:17] guido_g	but i've to provide as much of possibilities as possible
[12:17] guido_g	that's my job
[12:17] guido_g	if you don't have the data, you can't
[12:17] sustrik	my job is to cur possibilities :)
[12:17] sustrik	cut
[12:17] guido_g	good
[12:18] sustrik	some balance may result from us two discussing
[12:18] guido_g	yes
[12:18] sustrik	basically, 0mq resulted from taking a corporate middleware and cutting everything not strictly needed off
[12:19] guido_g	and now we need to put things back in, otherwise it's not useable for larger projects
[12:19] sustrik	so when adding a feature back we need a serious understanding of why it's needed
[12:19] guido_g	where larger is more than a handful of nodes
[12:19] sustrik	otherwise we'll end up back in corporate middleware sphere
[12:19] sustrik	agreed
[12:19] sustrik	but extreme caution is needed
[12:20] guido_g	ack
[12:20] guido_g	one of the "key features" of ømq is its size
[12:20] sustrik	low memory footprint
[12:20] sustrik	right
[12:20] guido_g	and slick api
[12:20] sustrik	yes
[12:21] sustrik	we need a way to keep the memory footprint low
[12:21] sustrik	i am aware of that
[12:21] sustrik	HWM is already implemented
[12:21] sustrik	we need "max message size" option
[12:21] sustrik	as well
[12:21] sustrik	but that's orthogonal to monitoring
[12:22] guido_g	i think we should not start on the api side of monitoring
[12:22] guido_g	we should start by finding "interesting" data points in ømq
[12:23] sustrik	ack
[12:23] sustrik	so, can you produce a monitoring use case?
[12:23] guido_g	if we have that, i'm sure there will be a consistent way to gain access to them
[12:23] guido_g	i'll try
[12:23] sustrik	that'll be great
[12:23] sustrik	are you going to arrive at amsterdam btw?
[12:24] guido_g	hmmm...
[12:24] guido_g	would be the most expensive beer i ever had
[12:24] sustrik	same here
[12:24] mato	hi guys
[12:25] sustrik	anyway, i think we should do a conference later on anyway
[12:25] guido_g	but on the other hand, would be nice to discuss face to face (and scare away innocent bystanders :)
[12:25] mato	sustrik: check also brussels, pieter was offering space to crash at his place
[12:26] sustrik	i thought of doing some event during/after FOSSDEM
[12:26] sustrik	that's february
[12:26] sustrik	and makes the whole thing more worth of coming
[12:26] sustrik	as you can attend the conference as well
[12:27] guido_g	sounds good
[12:30] guido_g	ok, need to do something for my health (besides eating :)
[12:31] guido_g	will come up with some ideas regarding monitoring
[12:33] sustrik	great
[12:33] sustrik	thanks
[13:20] mikko	sustrik: http://github.com/mkoppanen/php-zmq/issues#issue/11 does this look familiar?
[13:20] mikko	seems like the segfault happens inside uuid
[13:45] sustrik	mikko: no
[13:45] sustrik	yes, the segfault is inside uuid
[13:46] sustrik	it's either invalid buffer passed to uuid_generate
[13:46] sustrik	or a bug in libuuid
[13:46] sustrik	anyway, hard to say what have gone wrong without reproducing the case
[13:47] mikko	i remember seeing this ages ago. it was due to linking order of libuuid
[13:47] sustrik	oh my
[13:47] mikko	http://usrportage.de/archives/922-PHP-segfaulting-with-pecluuid-and-peclimagick.html
[13:48] mikko	someone blogged about similar issue where two modules are linked against libuuid
[13:48] mikko	and it was fixed by changing the loading order of them
[13:48] mikko	which sounds pretty strange
[13:50] sustrik	well, if libuuid has some code hooked to the loading of the library
[13:50] sustrik	some strange misinteraction may happen
[13:50] sustrik	causing it to be used before it is initialised
[13:53] sustrik	maybe it's initialised twice
[13:53] sustrik	then deinitialised once
[13:53] sustrik	then called
[13:53] sustrik	?
[13:54] mikko	im just reading through libuuid code
[13:56] mikko	it's not that
[13:57] mikko	the guy commented
[13:57] mato	sustrik: see my email re the version patch, please put back the two lines I asked for, you've broken make dist
[13:58] mato	sustrik: that and the version number propagation to doc/Makefile
[14:00] mikko	sustrik: rather interesting valgrind output
[14:05] mato	mikko: rpath patch has been sent off to sustrik for applying
[14:05] mikko	mato: nice
[14:05] mikko	i found hudson iphone application
[14:06] mikko	i've been checking the builds even on the move :)
[14:06] mato	:-)
[14:10] sustrik	mato: hey
[14:10] sustrik	should i apply patches to the build system?
[14:10] mato	sustrik: damnit well, you just did
[14:10] mato	sustrik: and you broke it
[14:11] mato	sustrik: so now please fix what you broke :-)
[14:11] sustrik	i mean from procedural point of view
[14:11] sustrik	am i the only committer?
[14:11] mato	sustrik: you're the only committer to the github hosted-repository, yes
[14:11] mato	sustrik: that's the way it should work
[14:11] sustrik	ok
[14:12] mato	sustrik: otherwise things get problematic due to the maint branch
[14:12] mikko	no more holidays for Martin
[14:12] mato	:-)
[14:12] sustrik	ugh
[14:12] sustrik	:)
[14:13] mato	sustrik: it'll help a lot if you eventually use a real mail client and/or pull requests
[14:13] mato	sustrik: since as I showed you, applying X patches becomes one command
[14:13] mato	no hand work involved
[14:14] sustrik	you have to show me how to do that later on
[14:15] mato	will do, but you'll have to move to a better mail client
[14:15] mato	since Thunderbird doesn't understand "Save As" means "Save this without mangling it" :-)
[14:41] CIA-14	zeromq2: Martin Sustrik maint * r6cd0867 / configure.in :
[14:41] CIA-14	zeromq2: Fixing the Red Hat packaging
[14:41] CIA-14	zeromq2: When adding ZMQ_VERSION macros, I incorrectly removed
[14:41] CIA-14	zeromq2: the PACKAGE_VERSION macro. Adding it back.
[14:41] CIA-14	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/amuEWp
[14:41] CIA-14	zeromq2: Martin Lucina maint * r57428db / configure.in : (log message trimmed)
[14:41] CIA-14	zeromq2: configure.in: Do not patch libtool rpath handling
[14:41] CIA-14	zeromq2: For historic reasons (mainly compatbility with really old libtool), configure was
[14:41] CIA-14	zeromq2: patching libtool to not use rpath in binaries. This breaks (among other things)
[14:41] CIA-14	zeromq2: correct operation of "make check" since the test binaries may not be run with
[14:41] CIA-14	zeromq2: the correct shared library version.
[14:41] CIA-14	zeromq2: Current best practice as seen e.g. at http://wiki.debian.org/RpathIssue suggests
[14:42] sustrik	mato: done
[14:43] mikko	do those go into master as well?
[14:43] CIA-14	zeromq2: Martin Sustrik master * r6cd0867 / configure.in :
[14:43] CIA-14	zeromq2: Fixing the Red Hat packaging
[14:43] CIA-14	zeromq2: When adding ZMQ_VERSION macros, I incorrectly removed
[14:43] CIA-14	zeromq2: the PACKAGE_VERSION macro. Adding it back.
[14:43] CIA-14	zeromq2: Signed-off-by: Martin Sustrik <sustrik@250bpm.com> - http://bit.ly/amuEWp
[14:43] CIA-14	zeromq2: Martin Lucina master * r57428db / configure.in : (log message trimmed)
[14:43] CIA-14	zeromq2: configure.in: Do not patch libtool rpath handling
[14:43] CIA-14	zeromq2: For historic reasons (mainly compatbility with really old libtool), configure was
[14:43] CIA-14	zeromq2: patching libtool to not use rpath in binaries. This breaks (among other things)
[14:43] CIA-14	zeromq2: correct operation of "make check" since the test binaries may not be run with
[14:43] CIA-14	zeromq2: the correct shared library version.
[14:43] CIA-14	zeromq2: Current best practice as seen e.g. at http://wiki.debian.org/RpathIssue suggests
[14:43] CIA-14	zeromq2: Martin Sustrik master * re168173 / configure.in :
[14:43] CIA-14	zeromq2: Merge branch 'maint'
[14:44] mato	sustrik: thx
[14:44] mikko	ah
[14:46] sustrik	mikko: yes?
[14:47] mikko	sustrik: building now
[14:48] mikko	All 7 tests passed
[14:48] mikko	rpath thingie fixed the build for me
[14:48] sustrik	great
[14:48] sustrik	how come all the bindings work
[14:48] sustrik	?
[14:48] mato	mikko: you might want to add 'make dist' to the build also
[14:48] sustrik	mikko: what's the link?
[14:48] mikko	http://valokuva.org:8080/
[14:48] mikko	it's building all the dependent projects atm
[14:49] mikko	yeah, means "photograph" in finnish
[14:49] mikko	mato: i'll add it
[14:49] sustrik	wow, great, i know one finnish word now
[14:49] keffo	terve!
[14:49] sustrik	though, i can't remember it :)
[14:49] mikko	mato: done
[14:53] mikko	building with 'make dist' now
[14:55] sustrik	mato: btw, there was some discussion about changing the license headers in 0mq source code
[14:55] sustrik	is there any outcome of that?
[14:55] mato	sustrik: talked about it with pieter, afaik does not need to be changed
[14:56] mato	sustrik: only the README and various supporting files (LGPL exception) need to be changed
[14:56] mato	sustrik: but not the actual source files, since neither the original copyright (iMatix) nor the license (LGPL) has changed
[14:56] sustrik	There's wrong name of the license there
[14:56] mato	sustrik: oh, only thing is, there is a wording error
[14:56] mikko	make[2]: *** No rule to make target `zmq_forwarder.1', needed by `dist-hook'. Stop.
[14:56] mato	yes, i just remembered
[14:57] mato	mikko: ja, you want asciidoc + xmlto for make dist, since it generates documentation
[14:57] mikko	im missing the doc generation tools
[14:57] mikko	hmm
[14:57] mikko	should make dist fail if those are not in place during configure?
[14:57] mato	make dist is special
[14:57] mato	in that, most users will never touch it
[14:57] mato	so, maybe, no, whatever, doesn't matter right now :-)
[14:58] mikko	hehe
[14:58] mato	sustrik: yeah, so, that stuff should be fixed, but no hurry
[14:58] mato	sustrik: TBD before a release
[14:58] mikko	installing the tools and rebuilding soon
[14:58] sustrik	akc
[14:58] sustrik	ack
[14:58] mato	sustrik: or we can do it together some time, involves writing a script
[14:59] mato	sustrik: that way you don't go changing files by hand :-)
[14:59] sustrik	i can do it by hand
[14:59] sustrik	but script is definitely better
[15:00] mato	sustrik: man, sometimes i feel you actually like feeding computers :-)
[15:00] mato	sustrik: "do it by hand" ... geez ...
[15:00] keffo	they're supposed to be fed!
[15:00] sustrik	it imposes discipline on a programmer
[15:00] sustrik	which is a good thing
[15:00] mato	fed by code, not by programmers :-)
[15:00] sustrik	a bit similar to brainwashing
[15:00] sustrik	but still good :)
[15:00] keffo	indeed, programmers solve problems.. The best programmer is the one already done :)
[15:31] mikko	mato: http://valokuva.org:8080/job/ZeroMQ2_master/ws/zeromq-2.1.0.tar.gz
[15:31] mikko	make dist built that
[15:32] mato	mikko: great...
[15:34] mikko	was missing zip as well
[15:34] mikko	noticed
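
A minimal sketch of the pre-flight check discussed above, assuming the build host only needs to know whether the tools named in the conversation (asciidoc, xmlto, zip) are on PATH before 'make dist' is run. The function name missing_tools is hypothetical and is not part of the 0MQ build system.

```python
import shutil


def missing_tools(tools=("asciidoc", "xmlto", "zip")):
    """Return the names from `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("cannot run 'make dist', missing: " + ", ".join(missing))
    else:
        print("all documentation tools found")
```

Run on the build machine before 'make dist', this reports the missing tools up front instead of failing later with "No rule to make target".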
[16:00] CIA-14	zeromq2: Steven McCoy master * r5b8af52 / (src/pgm_receiver.cpp src/pgm_sender.cpp):
[16:00] CIA-14	zeromq2: Fix assertion in PGM transports on cancel_timer
[16:00] CIA-14	zeromq2: Signed-off-by: Steven McCoy <steven.mccoy@miru.hk> - http://bit.ly/aknF0L
[16:14] delaney	hi, i'm curious why there isn't pre-built binaries for windows on the site download page
[16:17] mato	sustrik: those patches you applied from steve, where did they come from?
[16:18] sustrik	from steve
[16:18] mato	sustrik: the reason i'm asking is that on the ML I did not see patches with a Signed-Off-By tag
[16:18] mato	sustrik: but the commit has a Signed-Off-By tag...
[16:18] mato	sustrik: so I'm confused
[16:19] sustrik	damn, i've got that wrong
[16:19] sustrik	let me ask steven to sign them off post hoc
[16:20] mato	np, you're learning ... signing them off "post hoc" won't really help anything now
[16:20] mato	anyway, no real problem
[16:20] mato	no panic
[16:21] mato	just do remember to double check what you're pushing to github makes sense :-)
[16:21] sustrik	why won't it help?
[16:21] mato	because it's not in the git history
[16:21] mato	hence not persistent
[16:22] sustrik	?
[16:22] sustrik	there's sign-off in the repo
[16:22] mato	it doesn't matter much though since the licensing is automatic
[16:22] mato	signoff is just tracking
[16:22] mato	ah, right, added by you :-)
[16:22] mato	bad you :-)
[16:22] delaney	i'm a python guy and was looking to use zmq, would it be useful to the project to include the dll i just made on the download area?
[16:22] sustrik	so we need just steve to approve the sign-off now
[16:23] mato	well, there's no point
[16:23] mato	you added a signed-off-by tag
[16:23] mato	anyhow
[16:23] sustrik	by signing it off steven basically says "yes, i've created the patch myself"
[16:23] sustrik	that's it
[16:23] mato	the only point is, review everything you actually push to master :-)
[16:24] mato	i thought you liked bureaucracy :-)
[16:24] sustrik	i do
[16:24] sustrik	i just need few more patches to get it right
[16:24] sustrik	delaney: the problem is not providing binaries, rather maintaining them
[16:24] mato	or you could use the right tool... hang on... bureaucracy... right... involves doing everything by hand so that the job takes as long as possible :-)
[16:25] mato	i have to go
[16:25] sustrik	cya
[16:25] mato	sustrik: will make a robust patch for the version stuff, the latest idea looks ok
[16:25] mato	cyl
[16:25] sustrik	delaney: building new binaries when new version is released etc.
[16:26] delaney	yeah, true. still. i'm getting a 'Unable to find vcvarsall.bat' from easy_install but i have msvc 2010 express, any ideas?
[16:27] sustrik	no idea, sorry
[16:35] pieterh	delaney, it's normally produced by the installer if you ask for command line use
[16:38] starkdg	as an aside, is there any way to monitor buffer length in the io queues ? the number of messages ?
[16:38] starkdg	it might be a feature worth considering ?
[16:46] delaney	i'm trying to follow the http://www.zeromq.org/docs:windows-installations which i didn't see before... I'm able to build the solution but there is no libzmq.lib in the zeromq2\lib directory, only the libzmq.dll
[16:48] delaney	hmm, all seems to install to site-packages now, not sure what i did
[16:48] delaney	is that a misprint, should it be 'copy libzmq.DLL'?
[16:56] delaney	when i try to run the chat example i get http://pastebin.com/U1nrBEsU
[16:57] delaney	please excuse my c++/c n00bness
[17:02] mikko	are you compiling against github master?
[17:03] delaney	no, off the downloads, let me try that
[17:47] sustrik	delaney: it looks like you are passing invalid argument to bind
[17:48] sustrik	what's the string you are using?
[18:18] delaney	python display.py 127.0.0.1
[18:18] delaney	using the examples/chat, haven't touched the code
[18:44] sustrik	you are missing the port number i would say
[18:49] pieterh	delaney, did you read the user guide?
[19:00] delaney	ah, no i didn't, thought i'd just run the examples
[19:00] delaney	that makes more sense
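
A minimal sketch of the missing-port problem diagnosed above, assuming the endpoint is passed as a full "tcp://host:port" string. The helper split_tcp_endpoint and the port 5555 are hypothetical illustrations; they are not taken from the chat example or from the 0MQ API.

```python
def split_tcp_endpoint(endpoint):
    """Split 'tcp://host:port' into (host, port) or raise ValueError."""
    scheme, sep, address = endpoint.partition("://")
    if not sep or scheme != "tcp":
        raise ValueError("expected a tcp:// endpoint, got %r" % endpoint)
    # rpartition returns an empty separator when the address has no colon.
    host, colon, port = address.rpartition(":")
    if not colon or not host or not port.isdigit():
        raise ValueError("endpoint %r is missing a port number" % endpoint)
    return host, int(port)


if __name__ == "__main__":
    print(split_tcp_endpoint("tcp://127.0.0.1:5555"))
    try:
        split_tcp_endpoint("tcp://127.0.0.1")
    except ValueError as err:
        print(err)
```

Checking the string before handing it to bind turns the invalid-argument failure into a message that names the actual problem.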
[20:34] rphillips	I'm running strace on a misbehaving subscriber daemon. I don't see a TCP connect after zmq_connect() with a tcp:// endpoint
[20:34] rphillips	should I?
[20:39] rgl	not immediately. but soon a connection is going to be made in a background thread (the I/O thread of zmq).
[20:42] rphillips	strange... I don't see one
[21:09] rphillips	rgl: zdevice zmq_forwarder "tcp://127.0.0.1:22000" "tcp://127.0.0.1:22001" is creating netlink raw sockets on my system... that doesn't look correct
[21:11] rgl	maybe zmq has special handling for the loopback interface
[21:11] rgl	can you try connecting to different machines?
[21:20] rphillips	that seems to help... I'll have to submit a patch to the resolver code
[21:20] rphillips	thanks