[rancid] Problem with some F5 devices

Alan McKinnon alan.mckinnon at gmail.com
Tue Dec 3 20:16:17 UTC 2013


Hi Michael,

All the info you've given here indicates that things are working
correctly. The f5rancid -d output with the "HIT COMMAND" sections
especially shows that data was collected and it's in a useable format -
the parser detected the prompt and then found the expected commands in
the expected order.

This is good news, as you've narrowed down considerably the piece of
code that contains your bug. Briefly, how rancid runs is:

- rancid-run is the script you launch
- rancid-run launches control_rancid for each group of devices in turn
- control_rancid launches par
- par runs up to PAR_COUNT parallel sub-processes, one per device
- Each of those sub-processes starts rancid-fe which uses the device
type from router.db to start the appropriate rancid script (in your case
f5rancid)

- f5rancid runs clogin to fetch all the config info from the device.
Usually the output goes into a .raw disk file, but there is an option to
use pipes as well
- f5rancid then goes through that saved output line by line making sense
out of it, discarding unwanted text and writing the full desired output
to a .new file

The next bit is where I'm somewhat fuzzy (it's never failed me yet):

- the .new file is diff'ed against the previously fetched config, renamed
and booked into CVS, and various mail notifications are generated and sent.
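
As a rough illustration of the dispatch step in the list above (the device
type in router.db selecting the collector script), here is a minimal shell
sketch. It is not rancid-fe's actual code: the function name collector_for
and its lookup table are made up for this example, and only the
f5 -> f5rancid pairing is taken from this thread.

```shell
#!/bin/sh
# Sketch only, not rancid-fe's real code. collector_for and its lookup
# table are hypothetical; only "f5 -> f5rancid" comes from this thread.

collector_for() {
    # Expects one router.db line of the form host:type:state,
    # e.g. 10.255.128.148:f5:up
    line=$1
    host=${line%%:*}      # text before the first colon
    rest=${line#*:}       # text after the first colon
    dtype=${rest%%:*}     # second field: the device type

    case $dtype in
        f5)
            echo "f5rancid $host"
            ;;
        *)
            echo "no collector known for type '$dtype' ($host)" >&2
            return 1
            ;;
    esac
}

collector_for "10.255.128.148:f5:up"
```

Run as written, this prints "f5rancid 10.255.128.148"; any type the sketch
does not know about returns a non-zero status instead.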



Your setup appears to be working correctly up to the point where a .new
file is generated, and everything else is common code. This doesn't
leave much to examine: basically the last 20 lines of f5rancid after the
main loop labelled TOP.
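
While you work through that code, it may help to see at a glance which
devices still end up with an empty file in the configs directory after each
test run. A minimal sketch, assuming a POSIX shell; the function name
empty_configs is made up for this example and is not part of rancid.

```shell
#!/bin/sh
# Sketch only: list the zero-length regular files directly inside a
# directory. empty_configs is a hypothetical helper, not part of rancid.

empty_configs() {
    dir=$1
    for f in "$dir"/*; do
        # -f: regular file (so subdirectories such as CVS are skipped)
        # ! -s: file size is zero
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            basename "$f"
        fi
    done
}

# Check the directory given as the first argument, or the current one.
empty_configs "${1:-.}"
```

Run it from inside the configs directory, or pass that directory as the
argument. It prints one name per empty file and nothing at all once every
device has a non-empty config.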

I can't meaningfully help much further than this: I don't have any F5s,
so I think you need to debug further by reading the code. How's your Perl?








On 03/12/2013 19:44, Michael Sloan wrote:
> Thank you for the additional troubleshooting suggestions, although I'm not sure that I'm closer to a solution with this problem. I'll recap what I've learned from troubleshooting, and then show the file/screen output.
> 
> The troubleshooting/debugging recap:
> 
> Manually executing 'f5rancid <F5 device>' as the rancid user produces a <F5 device>.new file.
> Manually executing 'f5rancid <F5 vCMP>' as the rancid user produces a <F5 vCMP>.new file.
> 
> The f5rancid script first connects and determines the version of the F5 OS in use, and then initiates a second connection to the F5 or vCMP to issue the commands for the newer version of the F5 OS. If you run this second clogin command as the rancid user, you see all the correct screen output, but no file is created - this is true for both the F5 physical chassis and any vCMP. 
> 
> As far as I can see and tell, there aren't any differences in the behavior of the F5 chassis and the F5 vCMP, so I'm at a loss as to why the F5 chassis output files are created and the vCMP files are not.
> 
> 
> -----
> The troubleshooting/debugging information:
> 
> The screen output from 'f5rancid -d <vCMP>':
> 
> -bash-3.1$ f5rancid -d 10.255.128.148
> executing clogin -t 90 -c "bigpipe version 2>&1" 10.255.128.148
> The F5 says to use tmsh, using tmsh command table for config collection.
> executing clogin -t 90 -c "tmsh show /sys version;tmsh show /sys hardware;tmsh show /sys license;cat /config/ZebOS.conf;lsof -i :179;tmsh show /net route static;tmsh -q list" 10.255.128.148
> PROMPT MATCH: \[root@test-prod2:/S1-green-P:Active:In Sync\] config #
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config #  tmsh show /sys version
>     In ShowVersion: [root@test-prod2:/S1-green-P:Active:In Sync] config #  tmsh show /sys version
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys hardware
>     In ShowHardware: [root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys hardware
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys license
>     In ShowLicense: [root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /sys license
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # cat /config/ZebOS.conf
>     In ShowZebOSconf: [root@test-prod2:/S1-green-P:Active:In Sync] config # cat /config/ZebOS.conf
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # lsof -i :179
>     In ShowZebOSsockets: [root@test-prod2:/S1-green-P:Active:In Sync] config # lsof -i :179
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /net route static
>     In ShowRouteStatic: [root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh show /net route static
> HIT COMMAND:[root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh -q list
>     In WriteTerm: [root@test-prod2:/S1-green-P:Active:In Sync] config # tmsh -q list
> 
> And the file 10.255.128.148.new is created (about 17k in size).
> 
> If you use clogin <vCMP> to connect to the device and try the commands listed in the second "executing clogin" sequence, several produce no output (for instance, 'tmsh show /net route static' - because there are no static routes), one produces an error message ('cat /config/ZebOS.conf') because the file doesn't exist anywhere on the vCMP filesystem or on the F5 physical chassis filesystem. The rest produce the expected output.
> 
> There are no error messages in the *.new output file, aside from the 'file not found' error message from the above-mentioned 'cat' command. Both 'f5rancid <vCMP>' and 'f5rancid -d <vCMP>' produce vCMP.new files. The actual clogin command executed second:
> 
> clogin -t 90 -c "tmsh show /sys version;tmsh show /sys hardware;tmsh show /sys license;cat /config/ZebOS.conf;lsof -i :179;tmsh show /net route static;tmsh -q list" 10.255.128.148
> 
> produces no file on the RANCID server, even though the screen output displays the correct output. As an additional test, running that same clogin command on one of the physical chasses produces no file, although 'f5rancid <F5 chassis>' does.
> 
> 
> Michael Sloan
> Systems Programmer Network Support
> Office: (850) 922-5476
> Northwood Shared Resource Center
> Michael.Sloan at nsrc.myflorida.com
> 
> 
> 
> -----Original Message-----
> From: rancid-discuss-bounces at shrubbery.net [mailto:rancid-discuss-bounces at shrubbery.net] On Behalf Of Alan McKinnon
> Sent: Monday, December 02, 2013 9:43 AM
> To: rancid-discuss at shrubbery.net
> Subject: Re: [rancid] Problem with some F5 devices
> 
> Your tests described below are quite sensible, but also incomplete.
> 
> We know that clogin works on your f5 with a simple command.
> We know that clogin works on a vCMP with a simple command.
> We know that f5rancid works on your physical chassis.
> 
> What we don't know is if clogin and f5rancid work correctly on a vCMP using the full command set. There must be some difference between what the physical chassis and the vCMPs are sending back, otherwise both would work. I suspect some part of the vCMP output is upsetting the f5rancid script, causing it to exit early.
> 
> You need the big troubleshooting guns (this process is almost always what you need to do anyway if adding a device to router.db doesn't work
> out):
> 
> 1. Run this test in a temp directory (not the usual rancid dir) as the rancid user
> 2. Pick a vCMP
> 3. Run "f5rancid -d <vCMP>"
> 4. This will give lots of screen output plus a new file with the full text output from the device in the current directory
> 5. In the screen output will be the full clogin command used. Copy paste that command and run it manually. Verify that the full command set works as expected on a vCMP
> 6. Look inside the raw data file from step 3. Somewhere near the end I expect to see error messages of some kind. Those errors will tell you where we look next.
> 
> Note that "missed cmd(s)" and "End of run not found" messages are useless for debugging purposes, they are catch-all output and only indicate that something went wrong. They give no clue as to why.
> 
> 
> 
> 
> 
> On 02/12/2013 15:49, Michael Sloan wrote:
>> I'm relatively new to using RANCID, although it has been in use for a 
>> couple of years in my (new) workplace. We have been using RANCID with 
>> Cisco and Juniper equipment, and I recently added some devices from 
>> Aruba and F5 to the list of devices being archived with RANCID.
>>
>>  
>>
>> We have 4 separate F5 chasses doing load-balancing and reverse proxy, 
>> and these work flawlessly with RANCID (once I found an F5 script that 
>> supports version 11 of the F5 OS, anyway). On these chasses, we have 
>> several vCMPs for different clients. The vCMPs have their own IP, and 
>> respond to the same F5 commands that the chasses do.
>>
>>  
>>
>> The files generated in the configs directory for the vCMPs are all 
>> zero-length files, even though the physical chasses produce 23k-47k 
>> files in the configs directory. I have verified that clogin works, and 
>> clogin -c "bigpipe version" <F5-vCMP> does in fact produce the correct 
>> output. Running "f5rancid <F5-vCMP>" produces a 17k file in a test 
>> directory, so I know the process works for the vCMPs (see directory 
>> listings below).
>>
>>  
>>
>> I have tried removing the entries for the vCMPs in router.db, started 
>> 'rancid-run', then added the entries back, and RANCID created 
>> zero-length files for the vCMPs a second time.
>>
>>  
>>
>> We are using RANCID 2.3.6, on a CentOS 6 system, with Expect 5.43
>>
>>  
>>
>> Has anyone encountered this problem or have any ideas how to resolve it?
>>
>>  
>>
>> A typical logfile:
>>
>>  
>>
>> Trying to get all of the configs.
>> 10.255.128.146: missed cmd(s): tmsh show /net route static
>> 10.255.128.145: missed cmd(s): tmsh show /net route static
>> 10.255.128.147: missed cmd(s): tmsh show /net route static
>> 10.255.128.148: missed cmd(s): tmsh show /net route static
>> 10.255.128.152: missed cmd(s): tmsh show /net route static
>> 10.255.128.151: missed cmd(s): tmsh show /net route static
>> 10.255.128.153: missed cmd(s): tmsh show /net route static
>> 10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
>> 10.255.128.155: missed cmd(s): tmsh show /net route static
>> 10.255.128.157: missed cmd(s): tmsh show /net route static
>> 10.255.128.156: missed cmd(s): tmsh show /net route static
>> 10.255.128.158: missed cmd(s): tmsh show /net route static
>> 10.255.128.159: missed cmd(s): tmsh show /net route static
>> Getting missed routers: round 4.
>> 10.255.128.148: missed cmd(s): tmsh show /net route static
>> 10.255.128.145: missed cmd(s): tmsh show /net route static
>> 10.255.128.147: missed cmd(s): tmsh show /net route static
>> 10.255.128.146: missed cmd(s): tmsh show /net route static
>> 10.255.128.151: missed cmd(s): tmsh show /net route static
>> 10.255.128.152: missed cmd(s): tmsh show /net route static
>> 10.255.128.153: missed cmd(s): tmsh show /net route static
>> 10.255.128.156: missed cmd(s): tmsh show /net route static
>> 10.255.128.154: missed cmd(s): tmsh show /net route static,tmsh show /sys hardware
>> 10.255.128.155: missed cmd(s): tmsh show /net route static
>> 10.255.128.157: missed cmd(s): tmsh show /net route static
>> 10.255.128.158: missed cmd(s): tmsh show /net route static
>> 10.255.128.159: missed cmd(s): tmsh show /net route static
>>
>> cvs diff: Diffing .
>> cvs diff: Diffing configs
>> cvs commit: Examining .
>> cvs commit: Examining configs
>> Checking in configs/10.255.128.143;
>> /usr/local/rancid/var/CVS/other/configs/10.255.128.143,v  <--  10.255.128.143
>> new revision: 1.647; previous revision: 1.646
>> done
>> Checking in configs/10.255.128.144;
>> /usr/local/rancid/var/CVS/other/configs/10.255.128.144,v  <--  10.255.128.144
>> new revision: 1.283; previous revision: 1.282
>> done
>>
>>  
>>
>>  
>>
>> 10.255.128.145 and 10.255.128.146 are two of the physical chasses, 
>> while the IPs from .147 and above are vCMPs.
>>
>>  
>>
>> My router.db file:
>>
>>  
>>
>> 10.255.128.143:f5:up
>> 10.255.128.144:f5:up
>> 10.255.128.145:f5:up
>> 10.255.128.146:f5:up
>> 10.254.200.2:f5:up
>> 10.255.128.147:f5:up
>> 10.255.128.148:f5:up
>> 10.255.128.151:f5:up
>> 10.255.128.152:f5:up
>> 10.255.128.153:f5:up
>> 10.255.128.154:f5:up
>> 10.255.128.155:f5:up
>> 10.255.128.156:f5:up
>> 10.255.128.157:f5:up
>> 10.255.128.158:f5:up
>> 10.255.128.159:f5:up
>>
>>  
>>
>> And lastly, the directory listing for the configs directory:
>>
>>  
>>
>> -bash-3.1$ ls -l
>> total 592
>> -rw-r----- 1 rancid netadm 470068 Dec  2 08:17 10.254.200.2
>> -rw-r----- 1 rancid netadm  31335 Dec  2 08:17 10.255.128.143
>> -rw-r----- 1 rancid netadm  27155 Dec  2 08:17 10.255.128.144
>> -rw-r----- 1 rancid netadm  28406 Nov  5 09:33 10.255.128.145
>> -rw-r----- 1 rancid netadm  23159 Nov  5 09:33 10.255.128.146
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.147
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.148
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.151
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.152
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.153
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.154
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.155
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.156
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.157
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.158
>> -rw-r----- 1 rancid netadm      0 Nov 27 11:17 10.255.128.159
>> drwxr-x--- 2 rancid netadm   4096 Dec  2 08:21 CVS
>> -rw-r----- 1 rancid netadm  11256 Dec  2 08:18 wlc.nsrc.private
>>
>>  
>>
>> And my test from 'f5rancid 10.255.128.147' in a temp directory:
>>
>>  
>>
>> -bash-3.1$ ls -l
>> total 20
>> -rw-r--r-- 1 rancid netadm 17700 Dec  2 08:05 10.255.128.147.new
>>
>>  
>>
>>  
>>
>>  
>>
>> Michael Sloan
>>
>> Systems Programmer Network Support
>>
>> Office: (850) 922-5476
>>
>> Northwood Shared Resource Center
>>
>> Michael.Sloan at nsrc.myflorida.com 
>> <mailto:Michael.Sloan at nsrc.myflorida.com>
>>
>>  
>>
>>  
>>
>>
>>
>> _______________________________________________
>> Rancid-discuss mailing list
>> Rancid-discuss at shrubbery.net
>> http://www.shrubbery.net/mailman/listinfo/rancid-discuss
>>
> 
> 
> --
> Alan McKinnon
> alan.mckinnon at gmail.com
> 
> 


-- 
Alan McKinnon
alan.mckinnon at gmail.com


