From stevegt@TerraLuna.Org Thu Dec 8 20:36:09 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Thu, 8 Dec 2005 12:36:09 -0800 (PST)
Subject: [Infrastructures] ISconf 4.2.8: HMAC
Message-ID: <20051208203609.59615249E4@spirit.terraluna.org>

Hi All,

I've posted ISconf 4.2.8 as the latest stable release:

http://trac.t7a.org/isconf/pub/isconf-4.2.8.201.tar.gz

This version includes HMAC authentication of all messages on the wire. It's still UDP-based, but has fine-grained controls over network topology, multiple subnets, shutting off broadcast, and so on. See the man page for how to use all of this.

For detailed information on what's changed, this query will give you the timeline of bugs and fixes back to the previous stable release:

http://trac.t7a.org/isconf/timeline?from=12%2F08%2F05&daysback=90&ticket=on&changeset=on&milestone=on

The 4.3.1 work, starting now, will focus on the TCP mesh code. This will give us the ability to retire UDP, simplify firewall rules, and provide the foundation we'll need later for reporting, asset management, and monitoring. Look for 4.3.2 as the stable release of this next version. For the long-term roadmap, see:

http://trac.t7a.org/isconf/roadmap

I'm completely sure that there are bugs which will need to be fixed in the 4.2.8 series -- while this version passes regression tests across several test nodes, I don't yet have large-scale test cases built (working on this, using Xen; should be in place before 4.3.2). Please beat up on it and let me know what you find.

Thanks,

Steve

---
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From curzonj@gmail.com Sat Dec 10 03:54:43 2005
From: curzonj@gmail.com (Jordan Curzon)
Date: Fri, 9 Dec 2005 20:54:43 -0700
Subject: [Infrastructures] Isconf: Fetching blocks
Message-ID: <9d03aa20512091954s1d36913bjdb72b924046598c7@mail.gmail.com>

I have been getting the following error frequently.
The error occurs after several other hosts have updated with no problems. The problem is that if I run isconf up again it starts up again, but not from where it left off. Any ideas about debugging this?

isconf: error: missing block: /var/is/fs/cache/internal.curzons.net/block/814/814338f5b4c910e35a55d101d972998f7b6bd949-eeb84b2ef12f9232f90d15457136d992-1: Operation not permitted

From curzonj@gmail.com Sat Dec 10 18:39:27 2005
From: curzonj@gmail.com (Jordan Curzon)
Date: Sat, 10 Dec 2005 11:39:27 -0700
Subject: [Infrastructures] Reload the config file when commands are run.
Message-ID: <9d03aa20512101039k17294d08h5c5e73a804aaf614@mail.gmail.com>

------=_Part_12984_27360657.1134239967296
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

I know the main.cf file will be managed by the cache later, but in the meantime, I want to snap it and have it work. So I wrote a patch to reread the config file every time a command is executed. This is my first time in Python so it might not be proper form, but it works. Please give some feedback on this feature and patch.

Jordan Curzon ------=_Part_12984_27360657.1134239967296 Content-Type: application/octet-stream; name="reload_config_file.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="reload_config_file.patch" SW5kZXg6IENvbmZpZy5weQo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBDb25maWcucHkJKHJldmlzaW9uIDIwMSkK KysrIENvbmZpZy5weQkod29ya2luZyBjb3B5KQpAQCAtMyw2ICszLDkgQEAKIGltcG9ydCByZQog aW1wb3J0IHN5cwogCitpbXBvcnQgaXNjb25mCitmcm9tIGlzY29uZi5HbG9iYWxzIGltcG9ydCAq CisKIGNsYXNzIENvbmZpZzoKIAogICAgIGRlZiBfX2luaXRfXyhzZWxmLGZuYW1lKToKQEAgLTQ5 LDYgKzUyLDI5IEBACiAgICAgICAgICAgICAgICAgY29udGludWUKICAgICAgICAgICAgIHNlbGYu ZXJyb3IoInVua25vd24gaW5wdXQsIHN0YXRlICVzOiAlcyIgJSAoc3RhdGUsbGluZSkpCiAKKyAg ICBkZWYgdXBkYXRlRW52aXJvbm1lbnQoZm5hbWUpOgorCQlob3N0bmFtZSA9IG9zLmVudmlyb25b J0hPU1ROQU1FJ10KKworCQlpZiBvcy5wYXRoLmV4aXN0cyhmbmFtZSk6CisJCQljb25mID0gQ29u ZmlnKGZuYW1lKQorCQkJdmFycyA9IGNvbmYubWF0Y2goaG9zdG5hbWUpCisJCQlkZWJ1ZygiYWRk aW5nIHRvIGVudmlyb25tZW50OiAlcyIgJSBzdHIodmFycykpCisJCQlmb3IgKHZhcix2YWwpIGlu IHZhcnMuaXRlbXMoKToKKwkJCQlvcy5lbnZpcm9uW3Zhcl09dmFsCisJCWVsc2U6CisJCQlkZWJ1 ZygiJXMgbm90IGZvdW5kIC0tIHVzaW5nIGRlZmF1bHRzIiAlIGZuYW1lKQorCisgICAgICAgZG9t Zm4gPSBvcy5wYXRoLmpvaW4ob3MuZW52aXJvblsnSVNfSE9NRSddLCJjb25mL2RvbWFpbiIpCisg ICAgICAgIGlmIG9zLnBhdGguZXhpc3RzKGRvbWZuKToKKyAgICAgICAgICAgIG9zLmVudmlyb25b J0lTX0RPTUFJTiddID0gb3Blbihkb21mbiwncicpLnJlYWQoKS5zdHJpcCgpCisgICAgICAgIGVs aWYgbm90IG9zLmVudmlyb24uaGFzX2tleSgnSVNfRE9NQUlOJyk6CisgICAgICAgICAgICBlcnJv cigiJXMgaXMgbWlzc2luZyAtLSBzZWUgaW5zdGFsbCBpbnN0cnVjdGlvbnMiICUgZG9tZm4pCisK KyAgICAgICAgCisgICAgICAgIGRlYnVnKG9zLnBvcGVuKCJlbnYiKS5yZWFkKCkpCisKKyAgICB1 cGRhdGVFbnZpcm9ubWVudCA9IHN0YXRpY21ldGhvZCh1cGRhdGVFbnZpcm9ubWVudCkKKwogICAg ICMgWFhYIGNvbnZlcnQgdG8gZ2xvYmFsIGVycm9yCiAgICAgZGVmIGVycm9yKHNlbGYsbXNnKToK ICAgICAgICAgcmFpc2UgQ29uZmlndXJhdGlvbkVycm9yKCIlcyBsaW5lICVkOiAlcyIgJSAoc2Vs 
Zi5mbmFtZSxzZWxmLmksbXNnKSkKQEAgLTY3LDQgKzkzLDYgQEAKICAgICAgICAgICAgICAgICBi cmVhawogICAgICAgICByZXR1cm4gdmFycwogCisKKwogY2xhc3MgQ29uZmlndXJhdGlvbkVycm9y KEV4Y2VwdGlvbik6IHBhc3MKSW5kZXg6IElTY29uZi5weQo9PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBJU2NvbmYu cHkJKHJldmlzaW9uIDIwMSkKKysrIElTY29uZi5weQkod29ya2luZyBjb3B5KQpAQCAtMjIsNiAr MjIsNyBAQAogaW1wb3J0IHRpbWUKIGltcG9ydCBpc2NvbmYKIGZyb20gaXNjb25mLkVycm5vIGlt cG9ydCBpc2Vycm5vCitmcm9tIGlzY29uZi5Db25maWcgaW1wb3J0IENvbmZpZwogZnJvbSBpc2Nv bmYuR2xvYmFscyBpbXBvcnQgKgogZnJvbSBpc2NvbmYgaW1wb3J0IElTRlMKIGZyb20gaXNjb25m Lktlcm5lbCBpbXBvcnQga2VybmVsLCBCdXMKQEAgLTEyNSw2ICsxMjYsNyBAQAogICAgICAgICAg ICAgICAgICMgWFhYIHdoeSBkb24ndCB3ZSBqdXN0IHBhc3MgdGhlIHdob2xlIG1lc3NhZ2UgdG8g T3BzKCk/CiAgICAgICAgICAgICAgICAgb3B0WydyZWJvb3Rfb2snXSA9IG1zZy5oZWFkLnJlYm9v dF9vawogICAgICAgICAgICAgICAgIGRlYnVnKG9wdCkKKyAgICAgICAgICAgICAgICBDb25maWcu dXBkYXRlRW52aXJvbm1lbnQob3B0Wydjb25maWcnXSkKICAgICAgICAgICAgICAgICB2ZXJiID0g bXNnWyd2ZXJiJ10KICAgICAgICAgICAgICAgICBkZWJ1ZygidmVyYiBpbiBwcm9jZXNzIix2ZXJi KQogICAgICAgICAgICAgICAgIGlmIHZlcmIgIT0gJ2xvY2snIGFuZCBvcHRbJ21lc3NhZ2UnXSAh PSBOb25lOgpJbmRleDogTWFpbi5weQo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBNYWluLnB5CShyZXZpc2lvbiAy MDEpCisrKyBNYWluLnB5CSh3b3JraW5nIGNvcHkpCkBAIC01MywyNiArNTMsOSBAQAogICAgICAg ICBvcy5lbnZpcm9uLnNldGRlZmF1bHQoJ0lTX0hUVFBfUE9SVCcsIjY1MDI4IikKICAgICAgICAg aG9zdG5hbWUgPSBvcy5wb3BlbignaG9zdG5hbWUnLCdyJykucmVhZCgpLnN0cmlwKCkKICAgICAg ICAgb3MuZW52aXJvbi5zZXRkZWZhdWx0KCdIT1NUTkFNRScsaG9zdG5hbWUpCi0gICAgICAgIGhv c3RuYW1lID0gb3MuZW52aXJvblsnSE9TVE5BTUUnXQotICAgICAgICAKLSAgICAgICAgaWYgb3Mu cGF0aC5leGlzdHMoZm5hbWUpOgotICAgICAgICAgICAgY29uZiA9IENvbmZpZyhmbmFtZSkKLSAg ICAgICAgICAgIHZhcnMgPSBjb25mLm1hdGNoKGhvc3RuYW1lKQotICAgICAgICAgICAgZGVidWco ImFkZGluZyB0byBlbnZpcm9ubWVudDogJXMiICUgc3RyKHZhcnMpKQotICAgICAgICAgICAgZm9y 
ICh2YXIsdmFsKSBpbiB2YXJzLml0ZW1zKCk6Ci0gICAgICAgICAgICAgICAgb3MuZW52aXJvblt2 YXJdPXZhbAotICAgICAgICBlbHNlOgotICAgICAgICAgICAgZGVidWcoIiVzIG5vdCBmb3VuZCAt LSB1c2luZyBkZWZhdWx0cyIgJSBmbmFtZSkKIAotICAgICAgICBkb21mbiA9IG9zLnBhdGguam9p bihvcy5lbnZpcm9uWydJU19IT01FJ10sImNvbmYvZG9tYWluIikKLSAgICAgICAgaWYgb3MucGF0 aC5leGlzdHMoZG9tZm4pOgotICAgICAgICAgICAgb3MuZW52aXJvblsnSVNfRE9NQUlOJ10gPSBv cGVuKGRvbWZuLCdyJykucmVhZCgpLnN0cmlwKCkKLSAgICAgICAgZWxpZiBub3Qgb3MuZW52aXJv bi5oYXNfa2V5KCdJU19ET01BSU4nKToKLSAgICAgICAgICAgIGVycm9yKCIlcyBpcyBtaXNzaW5n IC0tIHNlZSBpbnN0YWxsIGluc3RydWN0aW9ucyIgJSBkb21mbikKLSAgICAgICAgCisJQ29uZmln LnVwZGF0ZUVudmlyb25tZW50KGZuYW1lKQogCi0gICAgICAgIGRlYnVnKG9zLnBvcGVuKCJlbnYi KS5yZWFkKCkpCi0KICAgICBkZWYgbWFpbihzZWxmKToKICAgICAgICAgc3lub3BzaXMgPSAiIiIK ICAgICAgICAgaXNjb25mIFstRGhycVZdIFstYyBjb25maWcgXSBbLW0gbWVzc2FnZV0ge3ZlcmJ9 IFt2ZXJiIGFyZ3MgLi4uXQo= ------=_Part_12984_27360657.1134239967296-- From stevegt@TerraLuna.Org Sun Dec 11 07:58:13 2005 From: stevegt@TerraLuna.Org (Steve Traugott) Date: Sat, 10 Dec 2005 23:58:13 -0800 Subject: [Infrastructures] Isconf: Fetching blocks In-Reply-To: <9d03aa20512091954s1d36913bjdb72b924046598c7@mail.gmail.com> References: <9d03aa20512091954s1d36913bjdb72b924046598c7@mail.gmail.com> Message-ID: <20051211075813.GB4086@terraluna.org> Hi Jordan, On Fri, Dec 09, 2005 at 08:54:43PM -0700, Jordan Curzon wrote: > I have been getting the following error frequently. The error occurs > after several other hosts have updated with no problems. The problem > is that if I run isconf up again it starts up again but not from where > it left off. Any ideas about debuging this? > > > isconf: error: missing block: > /var/is/fs/cache/internal.curzons.net/block/814/814338f5b4c910e35a55d101d972998f7b6bd949-eeb84b2ef12f9232f90d15457136d992-1: > Operation not permitted What this means is that the machine showing this error is not getting that file from any other machine. 
It means that the machine *is*, however, getting the journal file from some other machine, so we know they have seen each other at least once on the net. Just for a double check, you should be able to see the journal at this path:

/var/is/fs/cache/internal.curzons.net/volume/{branchname}/journal

Inside the journal, you should be able to find the 'snap' transaction in question by looking for the entry with the 814338f5b4c91... block ID. When this transaction fails, the next 'isconf up' on the same machine should retry the same (previously failed) transaction first.

Here's what I need to know:

- What isconf version are you running?

- Are these machines on the same subnet?

- What does the network load average look like? (Since current versions still use UDP for the 'whohas' messages, it's *possible* that we're just dropping all of the 'whohas' packets when they hit the net.)

- How big is the 81433... file? (I've been concerned about some implied but fuzzy timeouts when transferring large files, but haven't prioritized this so far because they will go away with the TCP mesh code.)

- Is it always the same file?

- Is it always the same host?

- The next time this happens, can you send me the /tmp/isconf.* log files on the machine where this has happened?

- You say "it starts up again but not from where it left off" -- are you sure it's not retrying the failed transaction, quickly succeeding this time, then continuing? (I hafta ask.)

Regardless, the next time you have a failed transfer, do this:

- save a copy of the journal and of /var/is/conf/history

- run 'isconf up' the second time

- if it looks like it didn't restart where you think it should have, then copy the display contents, grab another copy of the journal and history, and send me the display contents, the "before" and "after" copies of the journal and history, and the /tmp/isconf.* log files.

The reason I ask for the journal and history is that the history's whole purpose in life is to track what's been executed.
It would be, well, very strange for the journal replay to start from anywhere else, so now you've got me all paranoid and stuff. ;-)

As far as debugging this yourself, you can try 'tail -f /tmp/isconf.log', matching the debug messages in there with the debug() calls in the code, to see if you can divine the flow of what's happening while you run 'isconf up' etc. on the victim machine. You'll see debug messages from several microtasks interleaved, but once you get used to that it's pretty straightforward.

If this is a networking issue, then you'll probably be spending some time in Cache.py, and maybe HTTPServer.py. See the comments in Kernel.py for more information about the whole microtasks thing, and feel free to edit or add pages in the wiki as you go. There's some information in there that I wrote while thinking through the architecture and so on, but we really need to start a hacking howto.

The wire protocol flow may not be apparent from looking at the Cache.py code; in general think "arp + http". Host A, running 'isconf up', sends a "whohas" broadcast asking for the file. You will see the complete text of these messages in the isconf.log file; I have full debug logging on by default right now. Any host which has the file sends an "ihave" response, but host A will ignore all but the first. Let's say host A hears the "ihave" from host B first; host A then sends an HTTP GET to host B, and host B returns the file in the HTTP response.

In 4.3 we're deprecating HTTP in favor of "sendme/hereis" transfers via the TCP mesh. The same "whohas" and "ihave" messages will still be around, carried by the mesh rather than by UDP; this gives us reliable message passing and removes the need for the many retries you'll see when you dig into the logs. In later 4.3 releases we'll add signatures and encryption.

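The "arp + http" flow described above can be modeled in a few lines. This is only a toy, in-process sketch, not ISconf's Cache.py: the Host class, its methods, and fetch() are hypothetical names, and a plain loop stands in for both the UDP broadcast and the HTTP transfer.

```python
# Toy model of the "whohas" / "ihave" / GET exchange.
# Nothing here is ISconf code; every name is hypothetical.

class Host:
    def __init__(self, name, blocks=None):
        self.name = name
        self.blocks = dict(blocks or {})  # block ID -> contents

    def on_whohas(self, block_id):
        """Answer a 'whohas' with an 'ihave' (our name) if we hold the block."""
        return self.name if block_id in self.blocks else None

    def serve(self, block_id):
        """Stand-in for the HTTP response that carries the file."""
        return self.blocks[block_id]


def fetch(requester, peers, block_id):
    """Ask every peer 'whohas', take the first 'ihave', then pull the block."""
    for peer in peers:  # stand-in for the broadcast
        if peer is requester:
            continue
        if peer.on_whohas(block_id) is not None:
            requester.blocks[block_id] = peer.serve(block_id)
            return peer.name  # any later 'ihave' is ignored
    raise LookupError("missing block: %s" % block_id)
```

With hosts B and C both holding a block, fetch(a, [a, b, c], block_id) copies it from B, the first responder in this model, and raises LookupError if nobody answers, which roughly corresponds to the "missing block" error in this thread.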
Steve

From juri@fab-it.dk Tue Dec 13 22:24:05 2005
From: juri@fab-it.dk (Juri Rischel Jensen)
Date: Tue, 13 Dec 2005 23:24:05 +0100
Subject: [Infrastructures] Questions about isconf4...
Message-ID: <6AEDABD0-D160-436C-BEAF-A22027D3B515@fab-it.dk>

Hi all,

I'm in the process of finding the right configuration automation tool for my shop. I've looked at isconf several times over the last five years, but have been reluctant to try it as I couldn't see it fit into our systems. I've also found the documentation to be lacking when it came to instructions/guidelines for actual deployment, and I'm really glad to see that this has changed a lot in version 4. Being implemented in Python also adds to my final vote.

Anyways, my problem is that, although I've read all the documentation and skimmed the messages from the last 3-4 months in the mail archives, I still don't exactly understand how I'm supposed to use isconf4. Let me explain in more detail:

1. The documentation says that I should keep the branch count down. I can make sense of that, but what if I have 3 webservers in my domain, have them share the same branch and then on hostA do a

    isconf lock "Enabling new_apache_vhost"
    isconf snap new_apache_vhostfile.conf
    isconf exec a2ensite new_apache_vhostfile.conf
    isconf exec /etc/init.d/apache2 force-reload
    isconf ci

Then I have a history of what I've done on hostA, I have my newly added vhosts config file in the isfs and can replay that journal entry again if needed in the future. But as I understand it, the same journal entry gets executed on hostB and hostC because they share the same branch as hostA. Please correct me if I'm wrong here.

2. I can see in the journal file that every entry gets an ID. Have you planned on implementing a "changelog" verb - e.g. "isconf changelog" to see the history?

3. In our shop we do system administration for several customers and need to keep some of the configuration separate (the stuff that's different).
If I should implement a solution based on isconf4, as I see it I would have to manually build the whole configuration structure from the bottom by issuing isconf commands in the right order. I can't reuse some of the things from one customer when starting up a new one, as I can't share branches between domains. I'm looking for a tool where I can reuse as much of my work as possible when starting a new site (domain in isconf lingo).

I probably have more questions, but can't remember them right now. Too tired. I'm looking forward to an answer...

--
Kind regards

Juri Rischel Jensen
Fab:IT ApS
Vesterbrogade 50
DK-1620 København
Tel: 70 202 407 / Fax: 33 313 640
www.fab-it.dk / juri@fab-it.dk

From stevegt@TerraLuna.Org Wed Dec 14 11:04:02 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Wed, 14 Dec 2005 03:04:02 -0800
Subject: [Infrastructures] Questions about isconf4...
In-Reply-To: <6AEDABD0-D160-436C-BEAF-A22027D3B515@fab-it.dk>
References: <6AEDABD0-D160-436C-BEAF-A22027D3B515@fab-it.dk>
Message-ID: <20051214110402.GA9351@terraluna.org>

Hi Juri,

On Tue, Dec 13, 2005 at 11:24:05PM +0100, Juri Rischel Jensen wrote:
> 1. The documentation says that I should keep the branch count down.
> I can make sense of that, but what if I have 3 webservers in my
> domain, have them share the same branch and then on hostA do a
>
> isconf lock "Enabling new_apache_vhost"
> isconf snap new_apache_vhostfile.conf
> isconf exec a2ensite new_apache_vhostfile.conf
> isconf exec /etc/init.d/apache2 force-reload
> isconf ci

In all likelihood, you don't actually want to snap new_apache_vhostfile.conf -- you instead want to generate it from whatever your current customer/vhost database says. You use isconf to manage the executables which do that generation; those executables talk to the database (or flat files, or LDAP, etc.) to get the latest and greatest data, rather than cause it to evolve as machines are built.
If you have no database and just usually edit new_apache_vhostfile.conf directly instead, then read on... Also see the section on environmental data in the man page.

What's missing in isconf4 right now is the native configuration file management bits which were there in isconf versions 1-3. In version 1 we used SUP, in 2 and 3 we used rsync; the former was good because it gave us post-replication triggers (to handle e.g. the force-reload); the latter was bad because rsync doesn't do triggers. As Jordan hinted a couple of days ago, isconf4 is going to sync config files from the distributed cache rather than a central server, but otherwise it's the same idea; sync the file, check for triggers, run them, go to the next file.

The main reason isconf4 doesn't yet do this is that the people paying for the lion's share of isconf development right now are already managing these files using higher-layer systems like the database I describe above, so I haven't yet prioritized writing the isconf code which would do the job natively. (Okay, just now created ticket 62 to track this -- http://trac.t7a.org/isconf/ticket/62).

> Then I have a history of what I've done on hostA, I have my newly
> added vhosts config file in the isfs and can replay that journal
> entry again if needed in the future. But as I understand it, the
> same journal entry gets executed on hostB and hostC because they
> share the same branch as hostA. Please correct me if I'm wrong
> here.

You are correct.

> 2. I can see in the journal file that every entry gets an ID. Have
> you planned on implementing a "changelog" verb - e.g. "isconf
> changelog" to see the history?

Yes. It's a lower priority 'cause, well, you did find the journal file, didn't you? ;-) If you (or anyone) wanted to craft a standalone tool, the isconf.fbp822 module is what parses the message format that file uses. If anyone does write this, feel free to post the code in a ticket and I'll integrate it as an isconf subcommand.

> 3. In our shop we do system administration for several customers and
> need to keep some of the configuration separate (the stuff that's
> different). If I should implement a solution based on isconf4, as I
> see it I would have to manually build the whole configuration
> structure from the bottom by issuing isconf commands in the right
> order.

There's an ongoing dispute about the meaning of the word "configuration" -- so I have to ask; do you mean executables, or customer-specific configuration files? The former are what isconf4 is currently intended to handle; the latter fall into the same category as new_apache_vhostfile.conf.

> I can't reuse some of the things from one customer when
> starting up a new one, as I can't share branches between domains.
> I'm looking for a tool where I can reuse as much of my work as
> possible when starting a new site (domain in isconf lingo).

I'm not (yet) convinced you want to use different domains for this -- have you already read http://trac.t7a.org/isconf/wiki/DomainsVsBranches? What you've said so far makes me think you might want to use a single, neutrally-named domain. Can you elaborate on why the different branches need to be in different domains? Who has physical access to the hardware, who has root, etc?

If they're all in the same domain, then you can merge and hack branches simply by using your favorite editor/merge tool on the journal files for the two branches, taking care to not "edit history". If they are in different domains, then you'd need to copy over the ./block contents as well.

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From curzonj@gmail.com Mon Dec 19 14:49:16 2005
From: curzonj@gmail.com (Jordan Curzon)
Date: Mon, 19 Dec 2005 07:49:16 -0700
Subject: [Infrastructures] ISconf: Cache.py - bcast method
Message-ID: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>

Keeps crashing on me on the last line of the bcast method in Cache.py. It throws a socket.error exception with the EAGAIN error. I did some looking and other places in the code ignore that error. I trapped the exception and everything runs fine.

Is that a bug or am I misunderstanding things.

Jordan Curzon

From Daniel Hagerty Mon Dec 19 20:34:31 2005
From: Daniel Hagerty (Daniel Hagerty)
Date: Mon, 19 Dec 2005 15:34:31 -0500
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
Message-ID: <17319.6487.527329.140101@perdition.linnaean.org>

> Keeps crashing on me on the last line of the bcast method in Cache.py.
> It throws a socket.error exception with the EAGAIN error. I did some
> looking and other places in the code ignore that error. I trapped the
> exception and everything runs fine.
>
> Is that a bug or am I misunderstanding things.

Not speaking for what isconf is *supposed* to do, but it's almost universally the case that the correct response to EAGAIN is to try the system call that failed again. Something in the kernel interrupted the system call, preventing it from completing. You usually want to complete whatever it is, rather than pretending the kernel performed the task, when in fact it didn't.

(There are some system call toolkits that actually go so far as to prevent you seeing EAGAIN in high level interfaces -- if you really want to see EAGAIN, the low level interface is still there).

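For a non-blocking UDP socket, one way to "try again" is to wait until the socket is writable and then resend. The sketch below is written for current Python 3, where EAGAIN/EWOULDBLOCK surfaces as BlockingIOError (the code in this thread reports it as socket.error). The function send_with_retry and its parameters are hypothetical, and this is not how ISconf handles the error.

```python
import select
import socket


def send_with_retry(sock, data, addr, attempts=3, timeout=1.0):
    """Send one datagram on a non-blocking socket, retrying on EAGAIN."""
    for _ in range(attempts):
        try:
            return sock.sendto(data, addr)  # number of bytes sent
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: the send would block right now.
            # Wait (up to `timeout` seconds) for the socket to become
            # writable, then loop around and try again.
            select.select([], [sock], [], timeout)
    raise TimeoutError("socket stayed unwritable after %d attempts" % attempts)
```

Whether retrying is the right policy depends on why the send would block; simply dropping the datagram is the other option, since UDP callers already have to tolerate loss.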
From stevegt@TerraLuna.Org Mon Dec 19 21:28:10 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Mon, 19 Dec 2005 13:28:10 -0800
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
Message-ID: <20051219212809.GD9351@terraluna.org>

On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> Keeps crashing on me on the last line of the bcast method
> in Cache.py. It throws a socket.error exception with the
> EAGAIN error. I did some looking and other places in the
> code ignore that error. I trapped the exception and
> everything runs fine.
>
> Is that a bug or am I misunderstanding things.

You're getting an EAGAIN from a UDP sendto(), right? Bizarre. That means that the operation would block (and I have the socket set for non-blocking). Not sure what would cause that in UDP. Without knowing what's causing this, I'm not sure whether the correct action is to trap and discard the exception, or to yield and retry later. Some ideas:

- If you're using a nets file, check to make sure all of the IP addresses in there are valid and routable. Check your routing table as well... Add a debug() call to your patch to show the address that's failing.

- Check to see if there's something seriously wrong with the IP stack on the machine -- running out of mbufs maybe? What else is the machine doing?

Anyone else have any ideas for what might cause a UDP sendto() to block?

I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63

Steve

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From Daniel Hagerty Mon Dec 19 22:04:47 2005
From: Daniel Hagerty (Daniel Hagerty)
Date: Mon, 19 Dec 2005 17:04:47 -0500
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <17319.6487.527329.140101@perdition.linnaean.org>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com> <17319.6487.527329.140101@perdition.linnaean.org>
Message-ID: <17319.11903.868786.934925@perdition.linnaean.org>

> From: Daniel Hagerty
> Date: Mon, 19 Dec 2005 15:34:31 -0500
>
> Not speaking for what isconf is *supposed* to do, but it's almost
> universally the case that the correct response to EAGAIN is to try the
[...]

Never mind me, I was thinking of EINTR. I've always called EAGAIN by its other name, EWOULDBLOCK (or at least, I've never seen anyplace where they weren't synonymous).

From curzonj@gmail.com Mon Dec 19 23:49:57 2005
From: curzonj@gmail.com (Jordan Curzon)
Date: Mon, 19 Dec 2005 16:49:57 -0700
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <20051219212809.GD9351@terraluna.org>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com> <20051219212809.GD9351@terraluna.org>
Message-ID: <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>

Yes, from the sendto call. The machine is just for testing ISCONF. It is a clean install of Ubuntu with simple routing (just gets a dhcp address). I am not using the IS_NETS file and the address that fails is <broadcast>. Also, the call succeeds for some number of calls then throws the exception.

On 12/19/05, Steve Traugott wrote:
> On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> > Keeps crashing on me on the last line of the bcast method
> > in Cache.py. It throws a socket.error exception with the
> > EAGAIN error. I did some looking and other places in the
> > code ignore that error. I trapped the exception and
> > everything runs fine.
> >
> > Is that a bug or am I misunderstanding things.
>
> You're getting an EAGAIN from a UDP sendto(), right?
> Bizarre. That means that the operation would block (and I
> have the socket set for non-blocking). Not sure what would
> cause that in UDP. Without knowing what's causing this, I'm
> not sure whether the correct action is to trap and discard
> the exception, or to yield and retry later. Some ideas:
>
> - If you're using a nets file, check to make sure all of the
> IP addresses in there are valid and routable. Check your
> routing table as well... Add a debug() call to your patch
> to show the address that's failing.
>
> - Check to see if there's something seriously wrong with the
> IP stack on the machine -- running out of mbufs maybe?
> What else is the machine doing?
>
> Anyone else have any ideas for what might cause a UDP
> sendto() to block?
>
> I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63
>
> Steve
>
> --
> Stephen G. Traugott (KG6HDQ)
> UNIX/Linux Infrastructure Architect, TerraLuna LLC
> stevegt@TerraLuna.Org
> http://www.stevegt.com -- http://Infrastructures.Org
>

From stevegt@TerraLuna.Org Tue Dec 20 02:09:28 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Mon, 19 Dec 2005 18:09:28 -0800
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com> <20051219212809.GD9351@terraluna.org> <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
Message-ID: <20051220020928.GB14072@terraluna.org>

Curiouser and curiouser. What happens if you set the socket to blocking before the sendto() call, then back to nonblocking again afterwards? I.e.:

    self.sock.setblocking(1)
    self.sock.sendto(msg,0,(addr,self.udpport))
    self.sock.setblocking(0)

Has this only happened on one machine? Out of how many? Does dmesg say anything interesting (e.g. link state transients)?

Steve

On Mon, Dec 19, 2005 at 04:49:57PM -0700, Jordan Curzon wrote:
> Yes, from the sendto call. The machine is just for testing ISCONF. It
> is a clean install of Ubuntu with simple routing (just gets a dhcp
> address). I am not using the IS_NETS file and the address that fails
> is <broadcast>. Also, the call succeeds for some number of calls then
> throws the exception.
>
> On 12/19/05, Steve Traugott wrote:
> > On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> > > Keeps crashing on me on the last line of the bcast method
> > > in Cache.py. It throws a socket.error exception with the
> > > EAGAIN error. I did some looking and other places in the
> > > code ignore that error. I trapped the exception and
> > > everything runs fine.
> > >
> > > Is that a bug or am I misunderstanding things.
> >
> > You're getting an EAGAIN from a UDP sendto(), right?
> > Bizarre. That means that the operation would block (and I
> > have the socket set for non-blocking). Not sure what would
> > cause that in UDP. Without knowing what's causing this, I'm
> > not sure whether the correct action is to trap and discard
> > the exception, or to yield and retry later. Some ideas:
> >
> > - If you're using a nets file, check to make sure all of the
> > IP addresses in there are valid and routable. Check your
> > routing table as well... Add a debug() call to your patch
> > to show the address that's failing.
> >
> > - Check to see if there's something seriously wrong with the
> > IP stack on the machine -- running out of mbufs maybe?
> > What else is the machine doing?
> >
> > Anyone else have any ideas for what might cause a UDP
> > sendto() to block?
> >
> > I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63
> >
> > Steve
> >
> > --
> > Stephen G. Traugott (KG6HDQ)
> > UNIX/Linux Infrastructure Architect, TerraLuna LLC
> > stevegt@TerraLuna.Org
> > http://www.stevegt.com -- http://Infrastructures.Org
> >
> _______________________________________________
> Infrastructures mailing list
> Infrastructures@mailman.terraluna.org
> http://mailman.terraluna.org/mailman/listinfo/infrastructures

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From Tarjei.Jensen@akerkvaerner.com Tue Dec 20 06:21:08 2005
From: Tarjei.Jensen@akerkvaerner.com (Tarjei.Jensen@akerkvaerner.com)
Date: Tue, 20 Dec 2005 07:21:08 +0100
Subject: [Infrastructures] ISconf: Cache.py - bcast method
Message-ID:

Jordan Curzon wrote:
> Keeps crashing on me on the last line of the bcast method in Cache.py.
> It throws a socket.error exception with the EAGAIN error. I
> did some looking and other places in the code ignore that
> error. I trapped the exception and everything runs fine.
>
> Is that a bug or am I misunderstanding things.

EAGAIN just means that your script has been waiting/listening to a socket for so long that the operating system has decided that you have waited long enough.

It is one of the design faults in Unix. They moved a system problem into the user domain instead of solving it in the kernel.

Greetings,

This e-mail and any attachment are confidential and may be privileged or otherwise protected from disclosure. It is solely intended for the person(s) named above. If you are not the intended recipient, any reading, use, disclosure, copying or distribution of all or parts of this e-mail or associated attachments is strictly prohibited. If you are not an intended recipient, please notify the sender immediately by replying to this message or by telephone and delete this e-mail and any attachments permanently from your system.

From ceri@submonkey.net Tue Dec 20 09:05:03 2005
From: ceri@submonkey.net (Ceri Davies)
Date: Tue, 20 Dec 2005 09:05:03 +0000
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To:
References:
Message-ID: <20051220090503.GJ63860@submonkey.net>

--cDtQGJ/EJIRf/Cpq
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Dec 20, 2005 at 07:21:08AM +0100, Tarjei.Jensen@akerkvaerner.com wrote:
>
> Jordan Curzon wrote:
> > Keeps crashing on me on the last line of the bcast method in Cache.py.
> > It throws a socket.error exception with the EAGAIN error. I
> > did some looking and other places in the code ignore that
> > error. I trapped the exception and everything runs fine.
> >
> > Is that a bug or am I misunderstanding things.
>
> EAGAIN just means that your script has been waiting/listening to a
> socket for so long that the operating system has decided that you have
> waited long enough.

Or more generally, that the operation failed due to a resource shortage,
but might work if retried.

> It is one of the design faults in Unix. They moved a system problem into
> the user domain instead of solving it in the kernel.

Allowing the user to decide what to do next makes perfect sense to me.
The other option is to have a default action which we'd all be
complaining about the existence of. :)

Ceri

--
Only two things are infinite, the universe and human stupidity,
and I'm not sure about the former. -- Einstein (attrib.)

--cDtQGJ/EJIRf/Cpq
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (FreeBSD)

iD8DBQFDp8k/ocfcwTS3JF8RAlyEAJ48mkKCichBO2hM7588CHYhBJayVACfUEpS
LCPLKW0eEos8IdFC7lUBC7c=
=Ng9P
-----END PGP SIGNATURE-----

--cDtQGJ/EJIRf/Cpq--

From Daniel Hagerty Tue Dec 20 17:06:00 2005
From: Daniel Hagerty (Daniel Hagerty)
Date: Tue, 20 Dec 2005 12:06:00 -0500
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
 <20051219212809.GD9351@terraluna.org>
 <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
Message-ID: <17320.14840.502512.826882@perdition.linnaean.org>

> Yes, from the sendto call. The machine is just for testing ISCONF. It
> is a clean install of Ubuntu with simple routing (just gets a dhcp
> address). I am not using the IS_NETS file and the address that fails
> is <broadcast>.
Also, the call succeeds for some number of calls then
> throws the exception.

*Which* broadcast? 255.255.255.255, or subnet address all ones?

Does your OS perchance have any kind of ratelimiting on broadcast
traffic?

From stevegt@TerraLuna.Org Wed Dec 21 03:06:09 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Tue, 20 Dec 2005 19:06:09 -0800
Subject: [Infrastructures] ISconf: Cache.py - bcast method
In-Reply-To: <17320.14840.502512.826882@perdition.linnaean.org>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
 <20051219212809.GD9351@terraluna.org>
 <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
 <17320.14840.502512.826882@perdition.linnaean.org>
Message-ID: <20051221030608.GA15226@terraluna.org>

On Tue, Dec 20, 2005 at 12:06:00PM -0500, Daniel Hagerty wrote:
> > Yes, from the sendto call. The machine is just for testing ISCONF. It
> > is a clean install of Ubuntu with simple routing (just gets a dhcp
> > address). I am not using the IS_NETS file and the address that fails
> > is <broadcast>. Also, the call succeeds for some number of calls then
> > throws the exception.
>
> *Which* broadcast? 255.255.255.255, or subnet address all ones?

In Python, the string '<broadcast>' means INADDR_BROADCAST, i.e.
255.255.255.255. (This was more foolproof than trying to figure out
the local network numbers. If it turns out that this is somehow
causing blocking on a UDP sendto(), then I'm going to have to eat my
hat or something.)

Steve

--
Stephen G.
Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From curzonj@gmail.com Wed Dec 21 12:09:17 2005
From: curzonj@gmail.com (Jordan Curzon)
Date: Wed, 21 Dec 2005 05:09:17 -0700
Subject: [Infrastructures] RE: ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512210408l4a2e4810r6a744c120d59ca2b@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
 <20051219212809.GD9351@terraluna.org>
 <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
 <20051220020928.GB14072@terraluna.org>
 <9d03aa20512210408l4a2e4810r6a744c120d59ca2b@mail.gmail.com>
Message-ID: <9d03aa20512210409y609a28b0p40381a10e8bffeb6@mail.gmail.com>

My Python version is 2.4.1.

I am testing it on an imaged XEN virtual machine. I haven't tested it
on real hardware, or a different distro.

I haven't had the chance to test wrapping it in blocking calls.

On 12/19/05, Steve Traugott wrote:
> Curiouser and curiouser. What happens if you set the socket to
> blocking before the sendto() call, then back to nonblocking again
> afterwards? I.E.:
>
>     self.sock.setblocking(1)
>     self.sock.sendto(msg,0,(addr,self.udpport))
>     self.sock.setblocking(0)
>
> Has this only happened on one machine? Out of how many? Does dmesg
> say anything interesting (e.g. link state transients)?
>
> Steve
>
> On Mon, Dec 19, 2005 at 04:49:57PM -0700, Jordan Curzon wrote:
> > Yes, from the sendto call. The machine is just for testing ISCONF. It
> > is a clean install of Ubuntu with simple routing (just gets a dhcp
> > address). I am not using the IS_NETS file and the address that fails
> > is <broadcast>. Also, the call succeeds for some number of calls then
> > throws the exception.
> >
> > On 12/19/05, Steve Traugott wrote:
> > > On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> > > > Keeps crashing on me on the last line of the bcast method
> > > > in Cache.py.
It throws a socket.error exception with the
> > > > EAGAIN error. I did some looking and other places in the
> > > > code ignore that error. I trapped the exception and
> > > > everything runs fine.
> > > >
> > > > Is that a bug or am I misunderstanding things.
> > >
> > > You're getting an EAGAIN from a UDP sendto(), right?
> > > Bizarre. That means that the operation would block (and I
> > > have the socket set for non-blocking). Not sure what would
> > > cause that in UDP. Without knowing what's causing this, I'm
> > > not sure whether the correct action is to trap and discard
> > > the exception, or to yield and retry later. Some ideas:
> > >
> > > - If you're using a nets file, check to make sure all of the
> > >   IP addresses in there are valid and routable. Check your
> > >   routing table as well... Add a debug() call to your patch
> > >   to show the address that's failing.
> > >
> > > - Check to see if there's something seriously wrong with the
> > >   IP stack on the machine -- running out of mbufs maybe?
> > >   What else is the machine doing?
> > >
> > > Anyone else have any ideas for what might cause a UDP
> > > sendto() to block?
> > >
> > > I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63
> > >
> > > Steve
> > >
> > > --
> > > Stephen G. Traugott (KG6HDQ)
> > > UNIX/Linux Infrastructure Architect, TerraLuna LLC
> > > stevegt@TerraLuna.Org
> > > http://www.stevegt.com -- http://Infrastructures.Org
> > >
> > _______________________________________________
> > Infrastructures mailing list
> > Infrastructures@mailman.terraluna.org
> > http://mailman.terraluna.org/mailman/listinfo/infrastructures
>
> --
> Stephen G.
Traugott (KG6HDQ)
> UNIX/Linux Infrastructure Architect, TerraLuna LLC
> stevegt@TerraLuna.Org
> http://www.stevegt.com -- http://Infrastructures.Org
>

From stevegt@TerraLuna.Org Thu Dec 22 04:58:44 2005
From: stevegt@TerraLuna.Org (Steve Traugott)
Date: Wed, 21 Dec 2005 20:58:44 -0800
Subject: [Infrastructures] RE: ISconf: Cache.py - bcast method
In-Reply-To: <9d03aa20512210409y609a28b0p40381a10e8bffeb6@mail.gmail.com>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
 <20051219212809.GD9351@terraluna.org>
 <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
 <20051220020928.GB14072@terraluna.org>
 <9d03aa20512210408l4a2e4810r6a744c120d59ca2b@mail.gmail.com>
 <9d03aa20512210409y609a28b0p40381a10e8bffeb6@mail.gmail.com>
Message-ID: <20051222045843.GB8612@terraluna.org>

My isconf testing is all done in a Xen cluster, but only in domain 0
of each of several physical nodes, with no other guests running yet.
(I haven't yet got the machinery in place to have the test cases
automatically create and delete guest domains.)

It would be interesting if this turned out to be a Xen-specific
behavior. It actually wouldn't surprise me at all, as a matter of
fact. The hypervisor might be unable to deliver the packet to the
physical interface in this time slice. I'm not sure, but it would
make sense for that to cause the guest system call to block/EAGAIN.

You're running multiple guests in the same physical box, all doing
network traffic at the same time, right?

Steve

On Wed, Dec 21, 2005 at 05:09:17AM -0700, Jordan Curzon wrote:
> My Python version is 2.4.1.
>
> I am testing it on an imaged XEN virtual machine. I haven't tested it
> on real hardware, or a different distro.
>
> I haven't had the chance to test wrapping it in blocking calls.
>
> On 12/19/05, Steve Traugott wrote:
> > Curiouser and curiouser. What happens if you set the socket to
> > blocking before the sendto() call, then back to nonblocking again
> > afterwards?
I.E.:
> >
> >     self.sock.setblocking(1)
> >     self.sock.sendto(msg,0,(addr,self.udpport))
> >     self.sock.setblocking(0)
> >
> > Has this only happened on one machine? Out of how many? Does dmesg
> > say anything interesting (e.g. link state transients)?
> >
> > Steve
> >
> > On Mon, Dec 19, 2005 at 04:49:57PM -0700, Jordan Curzon wrote:
> > > Yes, from the sendto call. The machine is just for testing ISCONF. It
> > > is a clean install of Ubuntu with simple routing (just gets a dhcp
> > > address). I am not using the IS_NETS file and the address that fails
> > > is <broadcast>. Also, the call succeeds for some number of calls then
> > > throws the exception.
> > >
> > > On 12/19/05, Steve Traugott wrote:
> > > > On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> > > > > Keeps crashing on me on the last line of the bcast method
> > > > > in Cache.py. It throws a socket.error exception with the
> > > > > EAGAIN error. I did some looking and other places in the
> > > > > code ignore that error. I trapped the exception and
> > > > > everything runs fine.
> > > > >
> > > > > Is that a bug or am I misunderstanding things.
> > > >
> > > > You're getting an EAGAIN from a UDP sendto(), right?
> > > > Bizarre. That means that the operation would block (and I
> > > > have the socket set for non-blocking). Not sure what would
> > > > cause that in UDP. Without knowing what's causing this, I'm
> > > > not sure whether the correct action is to trap and discard
> > > > the exception, or to yield and retry later. Some ideas:
> > > >
> > > > - If you're using a nets file, check to make sure all of the
> > > >   IP addresses in there are valid and routable. Check your
> > > >   routing table as well... Add a debug() call to your patch
> > > >   to show the address that's failing.
> > > >
> > > > - Check to see if there's something seriously wrong with the
> > > >   IP stack on the machine -- running out of mbufs maybe?
> > > >   What else is the machine doing?
> > > >
> > > > Anyone else have any ideas for what might cause a UDP
> > > > sendto() to block?
> > > >
> > > > I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63
> > > >
> > > > Steve
> > > >
> > > > --
> > > > Stephen G. Traugott (KG6HDQ)
> > > > UNIX/Linux Infrastructure Architect, TerraLuna LLC
> > > > stevegt@TerraLuna.Org
> > > > http://www.stevegt.com -- http://Infrastructures.Org
> > > >
> > > _______________________________________________
> > > Infrastructures mailing list
> > > Infrastructures@mailman.terraluna.org
> > > http://mailman.terraluna.org/mailman/listinfo/infrastructures
> >
> > --
> > Stephen G. Traugott (KG6HDQ)
> > UNIX/Linux Infrastructure Architect, TerraLuna LLC
> > stevegt@TerraLuna.Org
> > http://www.stevegt.com -- http://Infrastructures.Org
> >
> _______________________________________________
> Infrastructures mailing list
> Infrastructures@mailman.terraluna.org
> http://mailman.terraluna.org/mailman/listinfo/infrastructures

--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org

From Daniel Hagerty Thu Dec 22 15:13:31 2005
From: Daniel Hagerty (Daniel Hagerty)
Date: Thu, 22 Dec 2005 10:13:31 -0500
Subject: [Infrastructures] RE: ISconf: Cache.py - bcast method
In-Reply-To: <20051222045843.GB8612@terraluna.org>
References: <9d03aa20512190649i689cd228s9a8a1f0909989a21@mail.gmail.com>
 <20051219212809.GD9351@terraluna.org>
 <9d03aa20512191549s5b95d5fco3847a3a02c495131@mail.gmail.com>
 <20051220020928.GB14072@terraluna.org>
 <9d03aa20512210408l4a2e4810r6a744c120d59ca2b@mail.gmail.com>
 <9d03aa20512210409y609a28b0p40381a10e8bffeb6@mail.gmail.com>
 <20051222045843.GB8612@terraluna.org>
Message-ID: <17322.49819.79681.285384@perdition.linnaean.org>

> My isconf testing is all done in a Xen cluster, but only in domain 0
> of each of several physical nodes, with no other guests running yet.
> (I haven't yet got the machinery in place to have the test cases
> automatically create and delete guest domains.)
>
> It would be interesting if this turned out to be a Xen-specific
> behavior. It actually wouldn't surprise me at all, as a matter of
> fact. The hypervisor might be unable to deliver the packet to the
> physical interface in this time slice. I'm not sure, but it would
> make sense for that to cause the guest system call to block/EAGAIN.

The OS in front of me doesn't do that, but yours may vary (I'm not
looking at Linux code).
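
The two options raised earlier in the thread for an EAGAIN from a
non-blocking UDP sendto(), trapping and discarding the exception or
yielding and retrying later, can be sketched roughly as below. This is
not the ISconf Cache.py code: it assumes Python 3 rather than 2.4, and
the function name send_datagram, the retry count, and the delay are all
hypothetical values chosen for illustration.

```python
import errno
import socket
import time


def send_datagram(sock, msg, addr, retries=3, delay=0.05):
    """Send msg to addr on a non-blocking UDP socket.

    EAGAIN/EWOULDBLOCK is retried up to `retries` times; after that the
    datagram is dropped and False is returned. Any other socket error
    is re-raised so real faults stay visible.
    """
    for attempt in range(retries + 1):
        try:
            sock.sendto(msg, addr)
            return True
        except OSError as exc:
            if exc.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            if attempt < retries:
                # A real daemon would yield to its event loop here
                # instead of sleeping; sleep keeps the sketch short.
                time.sleep(delay)
    return False


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.setblocking(False)
    # '<broadcast>' is Python's spelling of INADDR_BROADCAST; port 9
    # (discard) is an arbitrary choice for this demo.
    print(send_datagram(sock, b"ping", ("<broadcast>", 9)))
    sock.close()
```

With retries=0 this reduces to plain trap-and-discard, which is the
behavior Jordan's patch was described as having; since UDP is already
lossy, dropping the datagram after a few attempts seems a reasonable
default, though that is a judgment call rather than a tested result.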