Episode 011: du – disk usage

The du command provides a summary of disk usage for files and directories. By default it shows the number of blocks used by the contents of the directory or directories it is run on, and usage is calculated recursively: when du encounters a directory it recurses into its subdirectories, shows the disk utilization of the files and directories beneath it, and then presents a total for the top-most directory. This cascades down through each subdirectory, where the subdirectory becomes the parent, each of its children is summarized, and the parent is then totaled. For instance, the screen shot below shows an example of the du command on a directory that contains subdirectories:

du screen shot 1
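To watch this recursion for yourself, you can build a small throwaway tree and run du on it (the directory and file names here are arbitrary):

```shell
# Build a small tree to watch du recurse (names are arbitrary).
mkdir -p demo/sub1 demo/sub2
printf 'hello\n' > demo/file_a
printf 'world\n' > demo/sub1/file_b
printf '12345\n' > demo/sub2/file_c

# One line per subdirectory, then the grand total for the
# top-most directory on the last line.
du demo
```

Each subdirectory's line already includes everything beneath it; the final line is the total for demo.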

The default values are shown in units of 1024 bytes (1 kilobyte). This value can be adjusted using the -B or --block-size= option:

  • -BK, -k, --block-size=K, --block-size=1024 = display values in kilobytes (1024 bytes)
  • -BM, -m, --block-size=M, --block-size=1048576 = display values in megabytes
  • -BG, --block-size=G = display values in gigabytes
  • -BT, --block-size=T = display values in terabytes
  • -BP, --block-size=P = display values in petabytes
  • -BE, --block-size=E = display values in exabytes
  • -BZ, --block-size=Z = display values in zettabytes
  • -BY, --block-size=Y = display values in yottabytes

The practical use of these units varies depending on your system and storage capacity. Trying to display values in zettabytes may produce the error:

-B argument 'Z' too large

when you do not have a zettabyte of space to begin with. Similarly, displaying values in units far greater than the amount of storage the object actually uses may report a value of 1. For instance, running:

du -BG some_file

Where "some_file" is less than 1 gigabyte, du will report 1G for the object even though the file may actually be only a few kilobytes in size. So make an effort to stick with units that are reasonable for the filesystem you are reading. Newer versions of du also have the -h or --human-readable switch, which displays each value in the closest unit that keeps the number under 1024. That is:

  • 1 to 1023 kilobytes = displayed in kilobytes
  • 1024 kilobytes and up = displayed in megabytes
  • 1024 megabytes and up = displayed in gigabytes
  • 1024 gigabytes and up = displayed in terabytes

The letter representing the unit displayed will be appended to the end of the amount.
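A quick sketch of the scaling (the file name and size are made up for the demo):

```shell
# Create a roughly 2 MB file, then ask du for human readable units.
dd if=/dev/zero of=big_demo bs=1024 count=2048 2>/dev/null
du -h big_demo   # reported in M, since the value in K would exceed 1023
```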

The default behavior of du shows the totals for directories and not individual files. The -a or --all switch will report counts for all directories and files recursively.

The total usage is displayed on the last line of output by default. If you want to see only this amount instead of a value for each directory or file, the -s or --summarize flag is the one to use:

du -sh

This will display the disk usage total for the current directory. If you are just looking at the values of a few files:

du -h file1 file2 file3

The values will be shown for each file but no total will be calculated. To generate the total in this case, use the -c or --total option:

du -hc file1 file2 file3

This will display the usage for each file and then a total for all three files on the last line all in human readable units.
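For example, with three small throwaway files (names are arbitrary), GNU du labels the final line "total":

```shell
printf 'one' > file1
printf 'two' > file2
printf 'three' > file3
du -hc file1 file2 file3   # per-file usage, then a combined "total" line
```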

With the -S or --separate-dirs option, du can report just the files in a given directory without including its subdirectories. Entries are reported as normal, but a directory's usage figure will not include the values of any subdirectories in that location. For instance:

du -S somedir

Inside somedir are the following files and directories:

file1 5K
file2 8K
directory1/file1 10K
directory1/file2 30K

The resulting output would be:

40K somedir/directory1
13K somedir

instead of:

40K somedir/directory1
53K somedir

which is what plain `du somedir` would report.
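You can reproduce the example above with dd; the exact figures will differ with your filesystem's block size, but the relationship holds:

```shell
# Recreate the example tree (sizes approximate the text above).
mkdir -p somedir/directory1
dd if=/dev/zero of=somedir/file1 bs=1024 count=5  2>/dev/null
dd if=/dev/zero of=somedir/file2 bs=1024 count=8  2>/dev/null
dd if=/dev/zero of=somedir/directory1/file1 bs=1024 count=10 2>/dev/null
dd if=/dev/zero of=somedir/directory1/file2 bs=1024 count=30 2>/dev/null

du -S somedir   # somedir's line excludes directory1's usage
du somedir      # somedir's line includes it
```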

Recursion depth can be controlled with the -d or --max-depth= flag. Specifying 0 as the depth is the same as -s or --summarize. This flag controls what is displayed; it does not alter the values. The listing will only recurse down to the max depth: if a max depth of 1 is specified, du shows only the current directory and one level of subdirectories, but it reports the usage values normally. If you have a directory that contains a child directory with two directories underneath it, only the child directory appears in the output rather than an entry for each directory under the child. In either case, though, the reported usage for the child directory, and the overall total, are the same.

du screen shot 1

Compared with --max-depth=1:

du screenshot 2
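A minimal sketch of the depth behavior (directory names invented for the demo):

```shell
mkdir -p top/child/grand1 top/child/grand2
printf 'data' > top/child/grand1/f

# Only top/child and top are listed, but child's figure still
# includes grand1 and grand2.
du -d 1 top
```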

The --exclude= option excludes any directories or files matching the listed pattern:

du --exclude="*.txt"

Would exclude any file with the ".txt" extension from being counted in the usage values. If you need to specify a number of different exclude rules, put them in a text file and reference that file with the -X or --exclude-from= flag:

du --exclude-from=excludefile

The output of du is a list with one entry per line. The -0 or --null option replaces the newline after each entry with a zero (NUL) byte, producing output suited to tools that read NUL-delimited input.
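This pairs with other tools that accept NUL delimiters, such as GNU sort's -z option. A small sketch (names arbitrary):

```shell
mkdir -p nulldemo/sub
printf 'data' > nulldemo/sub/f

# NUL-terminated entries, sorted by size, then converted back to
# newlines just so the result is readable in a terminal.
du -0 nulldemo | sort -z -n | tr '\0' '\n'
```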

The du command operates on blocks of storage, not on the actual space used by the object. Blocks are representations of how data is stored on a storage device like a disk; hence this type of storage is called block storage. The disk is divided into partitions, and the chosen filesystem formats each partition into usable blocks of storage. Files are written to the filesystem in blocks. You can see the block size set for the filesystem by running the appropriate tool for your filesystem. For ext filesystems run:

dumpe2fs /dev/### | grep "Block size"

More than likely you will not be able to run this command as a normal user, so run it as root or use the sudo command. A common value is 4096 bytes, or 4K, blocks. Files are therefore written out in 4K blocks on the filesystem. A file of 1 byte and a file of 4095 bytes each use one whole block, as only one file can be written to a single block. A 4097 byte file will consume 2 blocks of disk storage, 8192 bytes in total, leaving 4095 bytes of the second block unused. There is an easy way to demonstrate this using the dd command:

dd if=/dev/zero of=dutest bs=4096 count=1

This will create a single file called dutest that is 4096 bytes in size (one 4096 byte block, written once). Issue the du command on this file:

du dutest

And the result will be 4, that is, 4K. Repeat the dd command to create two more files of different sizes:

dd if=/dev/zero of=dutest2 bs=4097 count=1
dd if=/dev/zero of=dutest3 bs=7000 count=1

Run the du command again on these three files:

du -h dutest*

Both dutest2 and dutest3 will show 8, or 8.0K, as being used, even though these files are actually different sizes. This is because du reports usage in blocks, not actual file size. To change this behavior you can use the --apparent-size switch:

du --apparent-size -h dutest*

The values reported are now more closely related to how much space the data actually consumes:

du screenshot 3 --apparent-size

Be aware of these differences when comparing the output of du with the results of other applications like ls or wc.
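The contrast is easiest to see in exact bytes with -B1 (the file name is arbitrary):

```shell
printf 'hi' > tiny              # 2 bytes of data
du -B1 tiny                     # one whole block, e.g. 4096
du -B1 --apparent-size tiny     # 2
```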

The du command has options for handling hard and symbolic links. By default du will not count multiple instances of a hard link, and it will not dereference, or follow, symbolic links. The latter behavior corresponds to -P (or --no-dereference), but as it is the default, you probably will not need to use this flag. If you want to include symbolic links, use the -L or --dereference flag; du will then follow symbolic links to their original files and include them in the value.

The -l or --count-links option counts a hard link each time an instance is encountered. If you had three hard links to a file, du would normally count the file's usage only once. With the -l flag each hard link is counted in the total, and in this case the usage would include those 3 hard links.
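A quick demonstration (file names invented): the combined -c total grows once -l counts every link:

```shell
printf 'shared data' > orig
ln orig link1
ln orig link2
ln orig link3

du -c orig link1 link2 link3    # the file is counted once
du -cl orig link1 link2 link3   # each hard link counted separately
```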

Aside from disk space usage the du command can show time related information about a file or directory. This information includes mtime, atime, and ctime:

  • mtime = modification time, the last time a file's contents were modified
  • atime = access time, the last time a file was accessed or read
  • ctime = change time, the last time the file's inode was changed

Note that when mtime changes, so does ctime. But ctime is based on the inode, and the inode holds information about a file other than the file name or data: time values, permissions, ownership, etc. Therefore ctime can change without altering the mtime, for instance if you run a command like chmod on the file. To view mtime information use the --time flag:

du --time
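The mtime/ctime distinction can be demonstrated with GNU stat (assuming its -c format option is available): a chmod moves ctime but leaves mtime alone.

```shell
printf 'x' > tfile
mtime_before=$(stat -c %Y tfile)   # mtime as seconds since the epoch
ctime_before=$(stat -c %Z tfile)   # ctime
sleep 2
chmod 600 tfile                    # metadata-only change
mtime_after=$(stat -c %Y tfile)
ctime_after=$(stat -c %Z tfile)
echo "mtime $mtime_before -> $mtime_after, ctime $ctime_before -> $ctime_after"
```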

You can change time to a different value like this:

du --time=word

Where word is:

  • atime (or access, use)
  • ctime (or status)

How these times are displayed can be altered using the --time-style= switch with one of these values:

  • full-iso – full precision timestamps: date, time with seconds and nanoseconds, and time zone
  • long-iso – YYYY-MM-DD HH:MM
  • iso – YYYY-MM-DD
  • +FORMAT where FORMAT is interpreted like the date command

The latter option takes a date format as you would specify for the date command. For instance, to display just the hour and minute the command would be:

du --time --time-style=+"%H %M"

Note that the format is enclosed in double quotes. The double quotes are needed because of the space; if a space were not used, the double quotes could be left off:

du --time --time-style=+%H-%M


If the video is not clear enough view it off the YouTube website and select size 2 or full screen.  Or download the video in Ogg Theora format:

Thank you very much!


Episode 010 – df – Exploring Disk Filesystem Usage

The df command is used to report file system usage. It shows the amount of storage available, used, and free per partition for each filesystem currently mounted on the system. Values are shown in blocks. The default output is in 1 kilobyte blocks and resembles the following:

output of df

The first column reports the block device or filesystem. Entries for block devices are pretty straightforward: /dev/sda# is partition # on the first SCSI drive. The filesystem entries may not be so evident.

  • rootfs is a special instance of ramfs (used for the page cache and dentry cache).
  • dev – a pseudo filesystem holding the mount points for all the hardware devices identified by the kernel
  • run – a tmpfs holding application runtime information
  • shm – a tmpfs partition used for shared memory, passing data between programs.
  • tmpfs – temporary space allocated from RAM to be used as temporary storage. Swappable.

For more information on those filesystems, consult the links below.

The next three columns report the size of the filesystem, how much storage is used, and how much is available in the specified format (1 kilobyte blocks by default). Use% shows the percentage of storage used, and the final column indicates where the filesystem is mounted.

You can restrict the output of df to a specific filesystem by specifying the path or filesystem device:

df /home

or

df /dev/sda6

df can also report on filesystems that may not have relevant information to list, via the -a or --all switch:

Notice that a few more filesystems show up here than before: proc, sys, devpts, binfmt, none (usbfs), and gvfs-fuse-daemon.

  • proc is a pseudo filesystem that holds process information and the state of the kernel
  • sys is a synthetic filesystem that provides information about “the in-kernel status to userspace devices, busses, and loaded modules.”
  • devpts provides an interface to pseudo terminal (pty) devices (xterm, konsole, etc.)
  • usbfs is the USB file system
  • gvfs-fuse-daemon: the userspace filesystem used by GNOME.

The default block size of 1 Kilobyte can be changed using the -B switch and specifying one of the following:

  • -BK (-k) Kilobyte blocks
  • -BM (-m) Megabyte blocks
  • -BG (-g) Gigabyte blocks
  • -BT Terabyte blocks
  • -BP Petabyte blocks
  • -BE Exabyte blocks
  • -BZ Zettabyte blocks
  • -BY Yottabyte blocks

Note that the first three can be specified using just the switch shown in parentheses. If you pass a unit beyond the storage capacity of your filesystems you will get some unhelpful results. For instance:

df -BE

May show each filesystem with 1 exabyte block in total, 1E used, and 1E available. Or:

df -BZ

May report

df: -B argument 'Z' too large

The units listed above are powers of 1024 bytes. There are two options to calculate in powers of 1000 instead:

  • -BKB 1000 byte blocks
  • -BMB 1000*1000 byte blocks

Some versions of df support the -h or --human-readable switch. This displays the values of total blocks, used, and available using the smallest unit that produces a value of at most three digits. That is, instead of 1024 kilobytes, df -h would display 1 megabyte.

There is also a human readable option for 1000 byte blocks, specified by the -H or --si switch.

The -T or --print-type switch will display the filesystem type between the filesystem and the total blocks:

df showing file system types

Output of df can be restricted to displaying only certain filesystem types with the -t or --type= switch:

df -t ext4

Or filesystems can be excluded with the -x or --exclude-type= switch:

df -x ext4

The df command has the ability to display inodes instead of blocks using the -i or --inodes switch. An inode (index node) is a “data structure that stores all the information about a file system object.” That is, for each file, directory, or device an inode is created to hold information about the object, like permissions, size, and timestamps, but not the file name or the data of the object. This switch can be combined with the other switches to display values in specific ways, for example:

df -ih

Will show the inode information of the system in the human readable format.

By default df will show ALL mounted filesystems, even remote filesystems. The -l or --local switch restricts df to showing only local filesystems.

The --no-sync option is the default mode for df; it does not perform a sync operation on the filesystems before reporting values. Therefore, if you have a mass storage device or content that has not yet been written to disk, the df command may not reflect that information. A sync can be forced with the --sync switch, although be prepared: in some cases it may take a while to sync the filesystems before the values can be reported.

The --total switch provides a total on the last line of each column's values where relevant.

df showing total in human readable format
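For example, on GNU df the added line is labeled "total":

```shell
df -h --total           # human readable output plus a grand-total line
df --total | tail -n 1  # just the total line
```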

 


Episode 009: w – who is there? CPU Load Averages!

The w command shows who is currently logged in to the system and from where. As Linux gains more desktop usage, one might wonder how useful this command is in an environment that largely caters to one person per system. While w is critical in a server environment, where multiple system administrators may be performing various functions and you need to know who is working on a machine before bringing down the server or a service, it still displays information that is valuable to the single desktop user.

 

The output of w:

w default output screen

Displays the following information in the header:

  • Time: 16:03:44
  • Uptime: 9 days, 19 hours 22 minutes
  • How many users are logged in: 8 users
  • Load Average: 1.05, 0.82, 0.69

This is followed by information on each person logged in and what process he or she is currently running:

  • Login name: dann
  • tty logged into: tty
  • remote host id:
  • Login time (when the user logged in) – 07Jun12
  • How long the user has been idle: 9days
  • JCPU: total time used by the user since login: 1:08
  • PCPU: CPU time used by the current running process: 0.00s
  • Command line of the current process: xinit …..

The header information is pretty straightforward except for load average. Load average is a very useful metric and appears not only in this command but also in top (which we will cover in a future episode). Load average is broken into 3 values: the past 1 minute, past 5 minutes, and past 15 minutes. What this average relates is how often processes are having to wait for CPU time.

Load average is often explained using a traffic analogy where the cpu is the highway and the load average is a value representing the cars waiting to get on the highway from an on ramp. When there are no cars on the on ramp, that means there is no backup and processes are not having to wait for cpu time. As CPU utilization increases more cars start to backup on the on ramp and thus load average increases. The more cars that are backed up, the greater the load average and the longer processes have to wait for CPU time.

Ideally, load average should be below 1.0 per processor. A load average of exactly 1 means the CPU is running at full capacity with no processes waiting. Above 1 could indicate a problem: processes are having to wait for CPU time.

Be careful with how you interpret load averages; you cannot rely on the load average of the past minute alone to indicate whether there is a problem. Hence the past 5 minutes and past 15 minutes are also represented. The latter two values provide more of a current profile than just the past minute. This allows you to observe the performance of your system over a short period of time and weed out false alarms.

Let's say you started a game or were compiling a program and saw your load average spike to over 1.35 for the past minute. Chances are your 5 and 15 minute averages saw a more modest bump, nothing to indicate there is a problem; after all, you ran a pretty intensive task.

On the other hand, if you start seeing values over 0.70 in two or more of these columns over time, you may want to start ferreting out the culprit, as this is a pretty good indication there is a problem. Again, be aware of what is going on. If you are rendering a movie, these may be normal load averages for such an intensive task. But if you are running a web server, this could indicate your system resources are getting close to being taxed and you may want to start beefing up your machine.

Note that the emphasis was placed on 1 processor. Many systems run more than one processor or have multiple cores per processor. These extra processors need to be taken into account when observing load averages. Thus the traffic analogy continues and we can equate each processor to a lane of traffic. So now in a dual core system where you have the equivalent of 2 processors, a load average of 1 is not alarming. A load average of 1.7 should start to raise eyebrows and a load average of 2 means both processors are running processes to the max. Getting over 2.15 means you probably need to start paying attention to what is going on. Should this trend continue over a period of time and start to grow, you need to fix the problem fast.

To summarize with another way to look at load averages: if the load average is under 1.0, your processor is more than capable of handling the load you have been putting on it. At 1.0 your CPU is maxed but handling the load as necessary. A load average of 2 means that ideally you would need at least twice the CPU capacity to handle the load your system has been experiencing. At 5.0 you are 5 times above your CPU's capacity. Again, this is a very rough analogy, not completely accurate, but close enough to give you an idea of what to look out for.
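The per-processor normalization above can be sketched in the shell. This is Linux-specific, reading the load averages from /proc/loadavg and counting CPUs with nproc:

```shell
# The first three fields of /proc/loadavg are the 1, 5, and
# 15 minute load averages.
read one five fifteen rest < /proc/loadavg
cpus=$(nproc)
echo "1-minute load: $one across $cpus CPU(s)"

# Divide by CPU count to compare against the "1.0 per processor"
# rule of thumb.
awk -v l="$one" -v c="$cpus" 'BEGIN { printf "per-CPU load: %.2f\n", l / c }'
```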

There are a few switches you can pass to the w command. The -h or --no-header option omits the header and displays only the user and process information. The -s or --short option uses a short format: the header, users, and processes are shown, but the JCPU and PCPU columns are omitted.

The -f or --from switch toggles display of the remote host field. This field will show up between the TTY and Login@ fields:

w command with --from option

Executing:

w username

Will show only the processes pertaining to the username specified. The header will remain unaltered.

 


Episode 008 – free: Understanding Linux Memory Usage

The free command is a handy snapshot of your system's memory and how much of it is being used. In conjunction with other tools like top, you can begin to understand where your system resources are being utilized and weed out potential bottlenecks and bugs. But before jumping into the deep end of system analysis, you need a decent grasp of how the Linux kernel utilizes memory, or your initial observations may send you tearing through the interwebs looking for a solution to a problem that does not exist.

Invoked by itself free will show a standard output like this:

By default the values are reported in kilobytes. Depending on your version of free the following display options may be available:

  • -b = view in bytes (--bytes)
  • -k = view in kilobytes (default) (--kilo)
  • -m = view in megabytes (--mega)
  • -g = view in gigabytes (--giga)
  • --tera = view in terabytes
  • -h = view in human readable mode where values are “automatically scaled to the shortest three digit unit” (--human)

Depending on the age of your system, the last two options may not be available.

Displaying the above using the -h or human readable mode produces this:

A first glance at the first line might produce alarm! The system is reporting 1.5G of RAM with 1.4G being used leaving 74 megs of RAM available.  The current spec of this system running the free command is an Arch install using Fluxbox as the Window Manager, running Chromium (to write this post) with 5 tabs open, the Gimp with three images, Xchat, and Thunderbird. Not a lot going on there so why is the system using 1.4 GB of RAM? I thought Linux was easy on memory, especially when using a light weight window manager?

If I kill a few of these processes, specifically Thunderbird, a few terminal windows, and an errant sflphone and sflphone daemon process left running from an iax connection a few days ago, I bring my usage down a bit:

That brings usage down to 854 megabytes of RAM used with 647 megabytes free, according to the first line. That is still a lot of memory for a few applications. Where is the vaunted low footprint Linux is supposed to offer?

Just looking at the first line of free does not give the clear picture of memory usage one expects, because it does not take into account how the memory is being utilized, particularly by the buffer cache. That is where the second line comes in. The second line details the amount of memory being utilized by programs and not by the buffer cache. Hence the “-/+”, which shows the amount of memory used minus the buffer cache, and the amount of free memory with the buffer cache added back.

So what exactly is the buffer cache? The buffer cache is an area of free physical memory the kernel uses to store data that has recently been read from disk and/or used by a program, or that is waiting to be written to disk. The logic behind this use of memory is that reading from disk is slow compared to reading from memory. If a bit of data is read from the disk, there is a pretty good chance it will be used again in the near future. So instead of losing the data to the bit bucket, it is kept in memory, where retrieving it is far faster than accessing the disk again.

The same principle holds for writing to a disk. If a program writes data to disk it may be more efficient to cache this write in the buffer cache and allow the program to continue on instead of waiting for the data to be physically written to the disk. The kernel will handle writing the information to disk and the program can continue with what it was doing.

The buffer cache is dynamically allocated by the kernel and if you observe a running system over a period of time you will see the cache grow and shrink. You may even observe it consume over 90% of your memory as in the first screen shot above and become alarmed at such low resources available. But without taking the second line into consideration, you are missing the full picture.

As applications consume more memory, the kernel begins dropping data from the buffer cache on a least-recently-used basis, returning that memory to the applications that need it. Thus, for all intents and purposes, you can consider the memory used by the buffer cache to be “free” in the everyday sense. To the kernel it is memory in use, but it is memory that will be made immediately available to a program should it be required.

That being said, it is important to note that as applications consume more RAM, the performance of your system will begin to degrade after a certain point. If over 90% of your system memory is actually being used by applications this reduces the amount of memory available to the buffer cache so the performance boost the buffer cache offered in the past may not be available as physical RAM becomes consumed. Again, you can observe this by paying particular attention to the “-/+ buffers/cache” line.

Having a better understanding of the cache buffers will shed light on the rest of the output of free. The last three columns of the first line indicate the amount of memory being “shared,” utilized by the “buffers,” and utilized by the “cache.” The shared column is obsolete and can be ignored. But if you add up the amount of buffers and cached memory utilization and subtract it from the used column, you should get a value close to what is reported as used memory in the “-/+ buffers/cache” line. Or add it to the free value to get the free value of the second line.
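That arithmetic can be redone straight from /proc/meminfo (Linux-specific field names; values are in kB). Note that modern kernels also export a MemAvailable field, which newer versions of free report instead of the -/+ line.

```shell
total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
buffers=$(awk '/^Buffers:/ {print $2}' /proc/meminfo)
cached=$(awk '/^Cached:/ {print $2}' /proc/meminfo)

# "-/+ buffers/cache" used: memory used minus the buffer cache.
used_minus_cache=$(( total - free_kb - buffers - cached ))
# "-/+ buffers/cache" free: free memory plus the buffer cache.
free_plus_cache=$(( free_kb + buffers + cached ))
echo "used: ${used_minus_cache} kB  free: ${free_plus_cache} kB"
```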

The Swap line (3rd line of output) shows the amount of swap space being used by the system. The total should match the size of your swap partition(s).

There are a few more options to the free command to outline. The first is the -t or --total option, which displays a fourth line totaling the sum of the first line and the swap line for the total, used, and free columns. The values are displayed in kilobytes by default, or in the unit you specified:

free -mt

This would display the amounts in megabytes.

There is the -o or --old switch, which displays the output of free in the “old” mode, essentially removing the buffers/cache line.

The -s or --seconds switch displays the output of free, then continues to display it every n seconds specified:

free -h -s5

This will display the output of free in human readable format every 5 seconds until killed by ctrl-c.

The -s switch can be combined with the -c or --count option, which displays the output of free for n iterations with the given number of seconds between each iteration:

free -m -s5 -c10

The output of free will be displayed every 5 seconds for 10 iterations. After the 10th output, control returns to the shell and free is no longer called. The -c option must be used in conjunction with the -s option.

Finally there is the --si option, which shows the values in powers of 1000 as opposed to 1024.

The free command is a window into the performance of your system, but understanding how memory is utilized by the Linux kernel is important to determine the health of your system. By focusing on the amount of used and free memory in the first line of free, and not taking into account the amount of memory being used by the cache buffers, you may be setting yourself up for unnecessary concern.


Episode 007 – Chmod and Unix Permissions

Anyone using a Linux based system for some period of time will eventually have to deal with permissions. From setting a script executable to fixing group access to a directory, understanding the basics of Linux file permissions is a must have skill, and very simple to learn.

Linux and other Unix based systems have three simple permissions: Read (r), Write (w), and Execute (x). These permissions are granted on three levels: Owning Users (u), Owning Groups (g), and Others (o). You can easily see the permissions of any given file or directory by issuing:

ls -l file/directory_name

The output would be something like this:

-rw-r--r--   1   dann   users   59   Mar   20   10:19   error.log

The first column, "-rw-r--r--", provides the permissions for the file. This error log has the following permissions:

  • read and write permissions for the owner (dann)
  • read permissions for the owning group (users)
  • read permissions for everyone else

Now if you look closely at this permission string you will see that there is an extra "-" before the three permission groupings. That first character indicates whether the object is a special type of file, like a directory or a character device.

Permissions on files are pretty straightforward. Read access grants a user the ability to "read" the file: they can open it and look at its contents using a command like cat, or open it in a text editor. The latter, though, does not grant write permission. Write permission is a separate right and grants the ability to edit a file or even delete it. Finally, execute permission grants the ability to execute the file, to run it as a program or script. This latter concept is important. If you create a shell script and attempt to run it like so:

./myshellscript.sh

It will more than likely fail, claiming "Permission denied." You must first grant execute permission to run the script (or execute it with a shell like sh or bash).
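A minimal illustration (the script name is made up):

```shell
printf '#!/bin/sh\necho ok\n' > myscript.sh
./myscript.sh 2>/dev/null || true  # fails: Permission denied
chmod u+x myscript.sh              # grant the owner execute permission
./myscript.sh                      # prints: ok
```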

Directory permissions operate a bit differently than file permissions. The read permission grants a user the ability to list the files in a directory. If that is the only permission you have on a directory, you can list its files, but you cannot view their contents even if the files' own permissions would allow it. The execute permission on a directory provides the rights to access a file and to change directory (cd) into that directory. If you have execute rights on a directory but not read, you can cd into the directory and access files in it by name, but you cannot list all files or access files by wildcards. Thus:

ls -l * or even ls -l

Will result in the “Permission denied” error if you do not have read access to that directory.

The write permission grants you the ability to alter or delete the contents of that directory. Even if a file in the directory does not grant you explicit write permission, you can delete it if you have write permission on the directory. You will be asked if you want to "remove write-protected ….", but if you say yes, the file or object will be deleted.

There are three other permissions that are available:

  • Sticky bit
  • Set UID
  • Set GID

The sticky bit is a special permission, primarily for directories, that restricts deletion of a file so that only the file's owner may delete it. This is commonly used for a /tmp directory, where you want global read, write, and execute access but do not want others to be able to delete files you own under that directory. If the sticky bit is set you will see it listed in place of the execute permission in the "others" grouping of the directory's permissions:

ls -ld /tmp

Reveals the following permissions for the /tmp directory:

rwxrwxrwt

The SetUID and SetGID bits operate similarly. SetUID runs or executes a file with the owner's privileges. SetGID does the same thing but executes a file at the owning group's privilege level. For example, if you have a shell script owned by root that has the SetUID bit set and you have execute permission on that file, the script will run as root, not as your user, when you execute it. (Note that many modern systems ignore the SetUID bit on interpreted scripts for security reasons; it is honored on compiled binaries.)

If the SetGID bit is set on a directory then files and subdirectories created under that parent directory will inherit the same group ownership as the parent directory. This mechanism exists to facilitate a shared directory where users in the same group can potentially read, write, and execute files in that directory without having to worry about changing permissions or ownership. On some systems, setting the SetUID bit on a directory will cause files and directories created under the parent to inherit the owning user of the parent directory. Note that in both instances, these ownerships will only be set for files and directories created after the SetGID or SetUID bits have been set. Existing files would have to be changed manually with the chown or chgrp commands.
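A minimal sketch of setting up such a shared group directory (the directory name is hypothetical; your primary group stands in for a real shared group):

```shell
mkdir -p shared              # hypothetical shared directory
chgrp "$(id -gn)" shared     # owning group (your primary group, for illustration)
chmod 2775 shared            # rwxrwsr-x: the "s" is the SetGID bit
ls -ld shared                # the group execute slot shows "s"
```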

Understanding file permissions is only half the battle. Knowing how to change file permissions is the next step. Changing file permissions can be accomplished using the chmod command, which stands for "change file mode bits." The chmod command, at the very least, requires two parameters: the permissions to be set and what to set those permissions on:

chmod u=rwx,go=rx error.log

Executing this command will set the following permissions:

  • Owning User = Read, write, execute
  • Owning Group = Read, execute
  • All Others = Read, execute

There are two different ways to represent permissions: symbolic and octal. The previous example uses symbolic representation, which uses "r" to represent "read," "w" to represent "write," "x" to represent "execute," "t" to represent "sticky bit," and "s" to represent "setuid or setgid bit." These permissions are specified against "ugoa," which is:

  • u = owning user
  • g = owning group
  • o = all others
  • a = everyone

You can use chmod to set the permissions for the owning user, owning group, and all others at once, or specifically for one or more of the three sets. Furthermore, you can explicitly set the permissions or add and subtract permissions using "+/-":

chmod u+w *.log
chmod u+x,go-x *.sh
chmod g+w *.txt

The first example adds the write permission for the owning user to all files that end in ".log" in the current directory. It changes no other permission for the owner, owning group, or all others. The second example adds the execute permission for the owning user if she does not have it and subtracts the execute permission from the owning group and all others if they have it, for all files ending in ".sh" in the current directory. Finally, the third example adds the write permission for the owning group to all files ending in ".txt" in the current directory.

The octal method for setting the permissions makes use of the octal value for each permission:

  • r = 4 or 0
  • w = 2 or 0
  • x = 1 or 0
  • s (SetUID) = 4 or 0 (or unspecified)
  • s (SetGID) = 2 or 0 (or unspecified)
  • t (sticky) = 1 or 0 (or unspecified)

The summation of each permission, a value 0-7, represents the permissions for each of the "ugo." The following examples show how to represent the permission values in octal format:

rwx = 7 (r+w+x = 4+2+1)
r-x = 5 (r+0+x = 4+0+1)
rw- = 6 (r+w+0 = 4+2+0)
r-- = 4 (r+0+0 = 4+0+0)
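These sums can be verified on a scratch file; this sketch assumes GNU stat for the octal display:

```shell
f=$(mktemp)                # scratch file just for illustration
chmod u=rwx,g=rx,o=r "$f"  # rwx=7, r-x=5, r--=4
stat -c '%a %A' "$f"       # prints: 754 -rwxr-xr--
```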

When specifying permissions in octal you must explicitly set the values for all three sets; otherwise the values you give apply to the rightmost set and work toward the left:

chmod 7 file = ------rwx on file
chmod 27 file = ----w-rwx on file

Be aware of this, as it is always good form to specify the octal value for all three sets, "ugo."

The special permissions (SetUID, SetGID, and sticky bit) can also be set in octal format; when set, they occupy a fourth, leftmost column and sum just like the permissions for ugo. To set the sticky bit on a directory you would issue:

chmod 1777 /var/www/groupsave

To set the SetUID on a script:

chmod 4755 somescript.sh

Finally, to add SetGID bit on a script:

chmod 2755 somescript.sh

The fourth, leftmost digit, while reserved for the special permissions, is optional. If you are not specifying any special permission you can leave it off or set "0" in that column:

chmod 0755 *.log

The chmod command has a few flags that can be passed. Probably the most frequently used is "-R" or "--recursive." This flag sets the permissions on all files and directories given, recursing into sub-directories and setting the permissions there as well.

For safety purposes there is "--preserve-root," which refuses to operate recursively on "/", the root directory. So issuing:

chmod -R --preserve-root 777 /

will fail. This is not set by default, so be careful: "--no-preserve-root" is the default.

The -v or --verbose option will output a line for every file processed, reporting whether or not the permissions were changed and what they were changed to. By default only errors are reported, and these can mostly be suppressed with the -f, --silent, or --quiet flag. Or you can take the middle road with the -c or --changes flag, which reports only files whose permissions were actually changed.

Finally there is --reference=FILE, which sets the permissions based upon the permissions of the FILE specified. Thus if a file called happy has permissions 777 and you execute:

chmod --reference=~/happy *.log

All the files ending in “.log” in the current directory will have the same permissions as the file happy in your home directory, which in this case is 777 (rwxrwxrwx).

Understanding file permissions and how to manipulate them is a key part of administering any Linux file system.

Bibliography:

  • man chmod
  • info file permissions

 

If the video is not clear enough view it off the YouTube website and select size 2 or full screen.  Or download the video in Ogg Theora format:

Thank you very much.

 



Episode 006 – pmount

Most desktop environments have a way to auto-mount any storage device you connect to the system. Plugging a USB drive into a system running Gnome, KDE, or Xfce4 will generally offer to mount the device for you and make it available to your preferred file manager. But those who choose to run a lighter environment, say with a window manager like Fluxbox, WindowMaker, or FVWM2, or run without X at all, may not have the luxury of their devices auto-mounting this way. You could edit your /etc/fstab to give a user the proper permissions to mount devices like this, or you could save yourself some time by using the pmount, or policy mount, command.

Most distributions do not install pmount out of the box but it is a quick download from most package managers. Once installed, there is some configuration that may need to be done before a user can utilize the pmount command. The /etc/pmount.allow file must be created listing all the possible devices that will be mountable using pmount. This file contains a whitelist of devices that can be mounted. Pmount will mount removable devices even if they are not listed in this file, but devices not detected as removable must be whitelisted in /etc/pmount.allow before pmount will mount them.

To determine if a device is flagged as removable issue this command:

cat /sys/block/[device]/removable

Where [device] is the device on your system.  If the value is “1” then it is removable and does not need a whitelist entry. If it is 0 then it is not considered a removable device and must be whitelisted to be used with pmount.
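A quick way to check every block device at once (this sketch assumes a Linux /sys filesystem):

```shell
# List every block device's removable flag (1 = removable, 0 = not)
for flag in /sys/block/*/removable; do
  [ -e "$flag" ] || continue           # skip if the glob matched nothing
  dev=${flag#/sys/block/}              # strip the path prefix...
  dev=${dev%/removable}                # ...and suffix, leaving the device name
  printf '%s: %s\n' "$dev" "$(cat "$flag")"
done
```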

Devices are specified in the whitelist one per line. Globs are allowed, so instead of:

sda
sda1
sda2
sda3

You can specify:

sda[123]

By default all devices are mounted under the /media directory by their given partition name.  So issuing:

pmount /dev/sdb1

Will mount partition 1 of device sdb on /media/sdb1. You can also pass a label to pmount and the partition or device will be mounted under that label:

pmount /dev/sdb clipzip

Will mount /dev/sdb under /media/clipzip instead of /media/sdb.

Successful mounting of a device by pmount occurs if the following conditions are met:

  1. The device is a block device (e.g., hard drive, flash drive, DVD).
  2. The device is not already listed in fstab.
  3. The device is not already mounted according to mtab.
  4. If the mount point already exists it is not being used by another device.
  5. The device is removable or it is listed in the /etc/pmount.allow whitelist.
  6. Finally, the device will mount if it is not locked.

To find out what partitions are available for mounting when you plug a drive into your system, you can issue the dmesg command, which will show you the device name and the partitions on the device.

If you issue the pmount command by itself it will show all the devices mounted by the pmount command (if there are any), just like issuing the mount command.

Once a device is mounted, use the pumount command to unmount it:

pumount /dev/sdb

There are a number of mount options you can pass to pmount. The list is as follows and includes the defaults:

  • -r or --read-only – mount device as read-only
  • -w or --read-write – mount device as read-write (default)
  • -s or --sync – mount device without write caching (default is async)
  • -A or --noatime – mount device with noatime (default is atime)
  • -e or --exec – mount device with the exec option (default is noexec)

If you are using a flash device you probably do not want to use the -s or --sync option, as this turns off write caching and will increase the amount of time it takes to perform operations on the file system: every change will need to be written immediately. Although, if you are prone to pulling out USB devices before unmounting them, you may want this option.

Pmount will autodetect the filesystem of the partition being mounted. Should it have difficulty for some reason, you can pass the filesystem using the -t or --type flag. The syntax is the same as the mount command. Pmount supports many file systems, but as of this entry btrfs is not supported.

The default umask for pmount is dependent on the filesystem. Generally it is 022, which equates to the following permissions: rw-r--r-- (644) for files and rwxr-xr-x (755) for directories. Remember, the equation for this is:

files = 666 – umask
directories = 777 – umask

The default umask for VFAT and NTFS is 077, which creates files with rw------- (600) and directories with rwx------ (700). Specifying the umask will only work with filesystems that accept a umask value; otherwise the value is ignored.
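The umask arithmetic above can be checked directly in a shell; this sketch assumes GNU stat:

```shell
umask 022            # the common default mask
d=$(mktemp -d)
touch "$d/file"      # 666 - 022 = 644 (rw-r--r--)
mkdir "$d/dir"       # 777 - 022 = 755 (rwxr-xr-x)
stat -c '%a %n' "$d/file" "$d/dir"
```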

If you are attempting to mount an encrypted filesystem you will be prompted for the passphrase upon executing pmount. You can specify a file to be used for this passphrase with the -p or --passphrase option.

Remember, to unmount a device mounted with pmount you issue the pumount command. Always unmount your device before physically removing it.

Pmount is a great utility to have on a desktop system but before installing it on a server, make sure you absolutely need this type of application. Allowing users to arbitrarily mount and unmount devices on a server is a security risk.

Bibliography:

  • man pmount
  • man umask
  • info umask


 


Episode 005 – wc

Ever want to know how many lines are in a file? How about how many words, or even how many characters? Well then the "wc" command is just for you. The "wc" command, short for word count, is a very simple command that will print "newline, word, and byte counts for each file specified, and a total count for all files combined if more than one file is included."

Consider the following little ditty:

the linux wc command
for those not in then know
stands for word count and
does a lot you should know

It counts lines and words and bytes
producing output on site
quickly giving you the numbers
without any blunders

Executing the following command:

wc poem.txt

Results in the following output:

9 40 215 poem.txt

To break it down:

  • 9 lines
  • 40 words
  • 215 characters

You can generate these values independently using the following options:

  • -c or –bytes = number of bytes in the file(s)
  • -m or –chars = number of characters in the file(s)
  • -l or –lines = number of lines in the file(s)
  • -w or –words = number of words in the file(s)

In most cases -c and -m will produce the same output unless you are working in a multi-byte character set and using multi-byte characters.
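A quick illustration of the difference, assuming a UTF-8 locale:

```shell
printf 'héllo' | wc -c   # 6 bytes: "é" is two bytes in UTF-8
printf 'héllo' | wc -m   # 5 characters (in a UTF-8 locale)
```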

As hinted, the wc command can take multiple files and will even accept input from standard in.  This example is impractical but shows how you can pass standard in to wc:

cat poem.txt | wc
cat poem.txt | wc list.txt -

In the first example the output of cat is piped to the wc command, producing the same results listed above. The second command pipes the output of cat to wc but we are also passing a file to wc at the same time, so we must use a "-" to indicate where the output of the cat command should be counted. If the list.txt file contained the following:

apples
dog food
sinkers
minnow trap
rope
milk

The output from the “cat poem.txt | wc list.txt -” would be:

 6   8  46 list.txt
 9  40 215 -
15  48 261 total

There is an option in wc to list the length of the longest line in a file or list of files: -L or --max-line-length. In our poem.txt example:

wc -L poem.txt

The following output is produced:

35 poem.txt

The final option to discuss in the wc command is --files0-from=, which passes wc a list of file names separated by ASCII NUL characters. The requirement here is that each file name is terminated by a NUL character and nothing else. Simply passing a file listing each file you want to process will not work:

poem.txt
list.txt

This list will not work. You need to create the list with NUL characters instead of newlines, commas, or anything else. In vim you would have to enter the files like so:

poem.txt^@list.txt^@

You cannot simply enter a "^" and a "@" between each entry; this requires a special key input: "ctrl-v 10." That is, press ctrl-v and then enter 10. To then process this list with wc, assuming the file listing the files we want to process is called fileslist.txt:

wc --files0-from=fileslist.txt

Producing the following output:

 9  40 215 poem.txt
 6   8  46 list.txt
15  48 261 total

You may also get an error:

wc:
: No such file or directory

This typically happens when the list contains a stray newline or other separator that becomes part of a file name.

If you read the info page for wc you will see an example of using the --files0-from= switch with the find command.
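An easier way to build such a NUL-separated list is printf '%s\0' or find -print0; the file names below are hypothetical samples:

```shell
printf 'one\ntwo three\n' > poem.txt   # hypothetical sample files
printf 'apples\n'         > list.txt

# printf '%s\0' emits each name NUL-terminated; "-" reads the list from stdin:
printf '%s\0' poem.txt list.txt | wc --files0-from=-

# or let find build the NUL-separated list:
find . -maxdepth 1 -name '*.txt' -print0 | wc --files0-from=-
```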

Bibliography:

  • man wc
  • info wc



Episode 004 – paste

The paste command merges the lines of two or more files, or a file and standard in if a second file is not specified or a "-" is used in place of the second file. Consider the following two files. The first file, test1.txt, contains the following lines:

a
one
three
cat
good

The second file, test2.txt contains the following lines:

tuna
blue finch
dogs
fish
eats

The paste command can be used to paste these two files like so:

paste test1.txt test2.txt

producing the following output:

a         tuna
one     blue finch
three   dogs
cat      fish
good   eats

Each line in test1.txt has been "pasted" to the corresponding line in test2.txt. The default delimiter used to "paste" the lines together is TAB. The delimiter can be specified with the -d or --delimiters flag:

paste --delimiters=, test1.txt test2.txt

Produces:

a,tuna
one,blue finch
three,dogs
cat,fish
good,eats

The --delimiters flag takes a list of delimiters to be applied between the pasted files consecutively. If you execute:

paste -d " ,1" test1.txt test2.txt

(NOTE: There is a space after the opening quote and before the ",".) All you will get is each line in test1.txt pasted next to the corresponding line in test2.txt separated by a " ". The "," and the "1" are not used at all in this example because we are only joining two files. If four files were being joined:

paste -d " ,1" test1.txt test2.txt test1.txt test2.txt

The following output would appear:

a tuna,a1tuna
one blue finch,one1blue finch
three dogs,three1dogs
cat fish,cat1fish
good eats,good1eats

The final flag that can be passed to the paste command is the -s or --serial option. Instead of pasting the lines from each file next to one another, all the lines from each file are printed out in a columnar format with each file on its own separate line:

paste -s test1.txt test2.txt

The output is:

a        one    three    cat      good
tuna    blue finch      dogs    fish    eats

The -d flag could be used in conjunction with the -s:

paste -d, -s test1.txt test2.txt

Produces:

a,one,three,cat,good
tuna,blue finch,dogs,fish,eats

The paste command can use standard in as one of the input files with the “-“:

echo -e "a\nand a\nnow a\nhere" | paste - test1.txt

Produces:

a          a
and a    one
now a    three
here      cat
             good

In this example the -e option to echo enables the interpretation of backslash escapes so that the \n is translated to a newline when passed to the paste command through the pipe. Furthermore, the total number of lines in the echo output is one less than the lines in test1.txt. The final line of test1.txt is not paired with a corresponding entry and is printed by itself.
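Pairing paste with standard in this way makes a handy line numberer; a small sketch (items.txt is a hypothetical file):

```shell
printf 'alpha\nbeta\ngamma\n' > items.txt  # hypothetical file
seq 3 | paste -d: - items.txt              # pair line numbers with lines
# 1:alpha
# 2:beta
# 3:gamma
```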

Bibliography:

  • man paste
  • info paste



Episode 003 – cut

The cut command, as the man page states, "removes sections from each line of a file." The cut command can also be used on a stream, and it can do more than just remove sections. If a file is not specified, or "-" is used, the cut command takes input from standard in. The cut command can be used to extract sections from a file or stream based upon specific criteria. An example of this would be cutting specific fields from a csv (comma separated values) file. For instance, cut can be used to extract the name and email address from a csv file with the following content:

id, date, username, first name, last name, email address, phone, fax
1,2012-01-01,franklinf, Ford, Franklin, ff@gmail.com, 7575551212, 7775551234
2,2012-02-01,levona, Allan, Levon, allanl@tllts.org, 3177771212,
3,2012-02-17,mannyt,  Trish, Manny, tmanny@hpr.org,7275551212,8885551236

The syntax for cut would be:

cut -d"," -f4,5,6 users.csv

The result would be displayed on standard out:

first name, last name, email address
Ford, Franklin, ff@gmail.com
Allan, Levon, allanl@tllts.org
Trish, Manny, tmanny@hpr.org

The -d option specifies the delimiter, which defaults to a TAB. In the example above the cut command will "cut" the line at each "," instead of a TAB. The -f option indicates which fields to select, in this case fields 4, 5, and 6, which correspond to "first name," "last name," and "email address."

The cut command can operate on fields, characters or bytes and must include one and only one of these options.

The field option operates on the cuts defined by the delimiter (-d), which is TAB by default. The -d option can only be used with the field option. Attempting to use the -d option with the characters (-c) or bytes (-b) options will result in an error. The -f value can be a comma separated list or a range separated by a "-":

cut -d"," -f 1,2,3,4
cut -d"," -f 1-4
cut -f 1-4,7,9
cut -d"," -f -7
cut -d"," -f 7-

Specifying a range "-7" will display the first field through the seventh field. The last entry will display field 7 and the remaining fields until the end of the line.
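The range forms can be checked quickly on a stream:

```shell
line='a,b,c,d,e,f,g,h'
echo "$line" | cut -d, -f 1-4   # a,b,c,d
echo "$line" | cut -d, -f -4    # a,b,c,d (same range)
echo "$line" | cut -d, -f 7-    # g,h
```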

The -f operator will also print lines that do not contain the delimiter character.  For instance in the example above, if a line was added to the end of the file producing:

id, date, username, first name, last name, email address, phone, fax
1,2012-01-01,franklinf, Ford, Franklin, ff@gmail.com, 7575551212, 7775551234
2,2012-02-01,levona, Allan, Levon, allanl@tllts.org, 3177771212,
3,2012-02-17,mannyt,  Trish, Manny, tmanny@hpr.org,7275551212,8885551236
this is a line without the delimiter

Executing:

cut -d"," -f4,5,6 users.csv

Would produce the following output:

first name, last name, email address
Ford, Franklin, ff@gmail.com
Allan, Levon, allanl@tllts.org
Trish, Manny, tmanny@hpr.org
this is a line without the delimiter

To prevent the -f option from printing lines that do not contain the delimiter, use the --only-delimited, or -s, option.

cut -d"," -f4,5,6 -s users.csv

The other two "field" options do not work with a delimiter: -c and -b. The --characters or -c option works on columns. The man and info pages refer to the -c option as working on characters, but many other references refer to columns. Technically cut references characters, but considering that the output of cut works on "fields" or "columns" of data, one can think of each character in a line as a column. Thus the delimiter in this case is each individual character. The values passed to -c must be a list of digits separated by commas, or a range:

echo "here is a line of text" | cut -c "1,2,3,4"
echo "here is a line of text" | cut -c "1-4"

Both examples produce the same output:

here

Where as:

echo "here is a line of text" | cut -c "6-"

Would produce:

is a line of text

Recall that specifying a value “#-” outputs from # to the end of the line.

The cut command can also work with bytes using the -b option and specifying a byte, or range of bytes like you would a field or character:

echo "here is a line of text" | cut -b "1-6"

Would produce the following output:

here i

Note that the result is the same as specifying cut -c "1-6". In most cases you will be working with a single-byte character set, and more than likely you will never need to worry about multi-byte characters. Thus, each character is a single byte.

Cut allows a few more flags to control the output. When discussing the -f, or field, option the -s, --only-delimited, flag was mentioned. The -s flag suppresses the output of lines not containing the delimiter.

Cut will produce the complement of the standard output when --complement is used. That is, it will output the opposite of what is normally generated by the cut command:

echo "here is a line of text" | cut -b "9-"

Produces everything from the ninth byte to the end of the line:

a line of text

Whereas:

echo "here is a line of text" | cut -b "9-" --complement

Produces the complement, which is bytes 1 to 8:

here is

Note that the space between “is” and “a” is included in this output even though it is not easy to show in the example.

Finally, the flag --output-delimiter=STRING will allow you to change the output delimiter to something else:

echo "1:2:3:4:5:6:7" | cut -d":" -f "2-5"

Will produce the following output:

2:3:4:5

But the ":" output delimiter can be altered with --output-delimiter:

echo "1:2:3:4:5:6:7" | cut -d":" -f "2-5" \
--output-delimiter=","

Producing:

2,3,4,5

 

Bibliography:

  • man cut
  • info cut


 


Episode 002 – tr

The tr, or translate (aka transliterate), command substitutes one set of characters for another set of characters, or it will delete a specified set of characters. The tr command takes input from standard in and writes to standard out. This simple example of the tr command translates some numbers into a word:

echo "12234" | tr '1234' 'aple'

The output:

apple

The tr command has "translated" each character in set 1 ('1 2 3 4') to its corresponding character in set 2 ('a p l e') and produced the word "apple" on standard out.

Sets can be defined with shorthand. For instance, in the above command set 1 ('1234') could be replaced with '1-4' to achieve the same results:

echo "12234" | tr '1-4' 'aple'

Take note that, unlike in other programs or applications, a range is not specified within brackets (e.g., [1-4]). Placing a range within brackets may not produce the desired results.

This also poses a problem when there is a need to translate "-" to something else or strip it out. Used at the start of a set (tr -ap1), the "-" is treated as a flag ("-a" in this case). Used within a set (tr 'ac-k1'), the "-" is treated as the range "c-k." Therefore, if you need to translate the "-", put it at the end of the set:

echo "hello-hooray-let-the-show-begin" | tr - ' '

Results in the output "hello hooray let the show begin" where each "-" is replaced by " ", a single space.

Characters in set 1 that do not appear on standard in are simply never matched:

echo "12234" | tr '1-5' 'aples'

Since there is no '5' on standard in there will be no translation to "s", and thus no "s" will appear on standard out.

Characters that appear multiple times in the first set will be translated to their last corresponding character in set 2:

echo "apple" | tr 'caapa' '12345'

The output is '544le', with the a and p in set 1 being translated to 5 and 4 respectively. The 'a' appears three times in set 1, assigned to 2, 3, and 5, but the tr command only uses the last entry, so 'a' is set to '5.'

The size of each set is important and depending on what version of tr is installed on your system you may get different results if the sets are not of equal size:

echo 12345 | tr '12345' 'abc'

In this example, if you were running the BSD version of tr the resulting output would be 'abccc', where the last character in set 2 is repeated until it matches the number of characters in set 1. The System V version of tr would truncate set 1 to the length of set 2, and the output would be 'abc45.' GNU tr handles these cases like BSD tr unless the -t (truncate set 1) flag is used.

Set 2 can contain the shorthand for repeat strings to fill out the size of set 2 to the size of set 1. The syntax for the repeat string is [n*] where n is some character:

echo 12345 | tr '12345' '[a*]bc'

Results in ‘aaabc’ and:

echo 12345 | tr '12345' 'ab[c*]'

Results in ‘abccc.’

Defined classes can be used instead of specifying individual characters in a set. For example, the set of all characters in the English alphabet can be defined by the class [:alpha:] and the set of all numbers 0-9 is defined by the class [:digit:].

echo "apple345" | tr '[:alpha:]' '1'

Results in '11111345', while:

echo "apple245!" | tr '[:digit:]' 'abc'

Results in 'appleccc!' Notice the exclamation point tacked on the end there; it is in neither the [:alpha:] nor [:digit:] classes. You might wonder why the 245 was converted to three 'c's. Recall that in GNU tr, like BSD tr, set 2 will repeat its last character to expand to the length of set 1. So 0 = a, 1 = b, and 2 through 9 = c.

A full list of character classes can be found in the man or info pages for tr. Some of the more common classes are:

  • [:alpha:] – all alphabet characters
  • [:digit:] – digits 0-9
  • [:alnum:] – alphanumeric characters (a-z, A-Z, 0-9)
  • [:punct:] – punctuation characters
  • [:lower:] – lower case letters (a-z)
  • [:upper:] – upper case letters (A-Z)
  • [:blank:] – horizontal whitespace; [:space:] – horizontal and vertical whitespace

In most cases the character classes [:lower:] and [:upper:] are the only two classes allowed in set 2, so long as their corresponding class is used respectively as set 1 (e.g., tr '[:upper:]' '[:lower:]'). The exception to this rule is when using the --delete or --squeeze-repeats option (covered shortly).
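A quick sketch of the case-translation rule, plus a --delete example with two classes concatenated in set 1:

```shell
echo "Hello World 123" | tr '[:upper:]' '[:lower:]'   # hello world 123
echo "Hello World 123" | tr -d '[:digit:][:blank:]'   # HelloWorld
```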

The tr command accepts the following flags:

  • -c, -C, --complement – first complement set 1
  • -d, --delete – delete characters in set 1
  • -s, --squeeze-repeats – replace repeated characters in set 1 with a single occurrence of that character
  • -t, --truncate-set1 – first truncate set 1 to the length of set 2

The -t, or truncate set 1, option reverses the way GNU tr handles set 2 when it is smaller than set 1. The default behavior is to repeat the last character of set 2 for each character of set 1 beyond the length of set 2. The -t option instead truncates set 1 to the size of set 2, so characters in set 1 with no corresponding character in set 2 are ignored:

echo "123456" | tr -t '123456' 'abc'

Would produce the output 'abc456', whereas without the -t flag the output would be 'abcccc', as '456' in set 1 would be matched to the 'c' in set 2.

The -d, or --delete, flag does not translate but deletes characters, and it does not accept a set 2. If you try to pass a set 2 it will produce an error. Set 1 can consist of characters or classes:

echo "123apple45" | tr -d '12345'

echo "123apple45" | tr -d '[:digit:]'

Both produce the same output, 'apple', stripping the numbers '123' and '45' from the output.

The squeeze, -s, flag is used to replace repeating characters specified in set 1 if no translation or deletion is to occur.  If translation or deletion is to occur then set 2 is used and squeeze occurs after the translation or deletion of characters specified in set 1.

echo "apple123455p" | tr -s 'ap5'

In this example tr replaces any repeating instances of a, p, or 5 with a single instance of the respective character. The result is 'aple12345p', where the second 'p' and second '5' were "squeezed" out but not the 'a', as that is merely a single character. The last 'p' is not stripped out because it occurs only by itself.

The -c, -C, or --complement flag replaces the complement of set 1 (those characters not in set 1) with the characters in set 2. For example:

echo "12345apple" | tr -c '[:alpha:]' 's'

The output of tr is 'sssssapples', replacing all the characters not in the [:alpha:] set, '12345', with 's'. (The trailing newline from echo is not in the [:alpha:] set either, hence the final 's'.) The characters 'apple' are ignored because they are in the [:alpha:] set.

Grasping the basics of the tr command will allow you to chain flags together to produce more complex results. A common example of this is to get all the words in a text file:

tr -cs '[:alpha:]' '[\n*]' < somefile.txt

This example makes use of the --complement and --squeeze flags. Standard in is the file somefile.txt, and that is passed to the tr command. The tr command first takes the complement of set 1, [:alpha:], which is all digits, spaces, punctuation, and so on (characters that are not in the alphabet a-z), and replaces them with \n, the newline character. The asterisk is required in set 2 to repeat the \n so that the length of set 2 equals that of set 1. The --squeeze option then squeezes out extra \n's in the output so that there is only one newline after each word. Recall that when --squeeze is used in conjunction with another flag, set 1 is processed by the first flag and --squeeze uses the values in set 2. The result is a list of words or letters on standard out.

echo "here is a simple line!  Only 10 words long b." | \
tr -cs '[:alpha:]' '[\n*]'

The output of this command is a string of the words and letters:

  • here
  • is
  • a
  • simple
  • line
  • only
  • words
  • long
  • b

The !, the 10, the ., and each space were stripped out and replaced with a newline character.

One of the more handy uses of the tr command is to replace DOS line breaks in text files with the standard newline:

tr -d '\r' < somefiledos.txt > somefileproper.txt

This example strips out, or deletes, the DOS end of line character \r. If you end up with a file that is all one continuous line of text, you may need to replace rather than delete:

tr '\r' '\n' < somefiledos.txt > somefileproper.txt
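A quick way to see the round trip (the file names are hypothetical; cat -A is a GNU extension that marks line ends with "$"):

```shell
printf 'line one\r\nline two\r\n' > dosfile.txt  # hypothetical DOS-style file
tr -d '\r' < dosfile.txt > unixfile.txt          # strip the carriage returns
cat -A unixfile.txt                              # "$" at line ends, no ^M left
```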

For more on the tr command consult the links below and also be sure to check out the man and info pages on your current system.

Bibliography:

  • man tr
  • info tr


 
