Working with textfiles on the Linux shell

Linux turns more and more to a popular operating system for IT professional. One of the reasons for this movement are the server solutions. Stability and low resource consuming are some of the important characteristics for this choice. May you already played around with a Microsoft Server you will miss the graphical Desktop in a Linux Server. After a login into a Linux Server you just see the command prompt is waiting for your inputs.

In this short article I introduce you some helpful Linux programs to work with files on the command line. This allows you to gather information, for example from log files. Before I start I’d like to recommend you a simple and powerful editor named joe.

Ctrl + C – Abort the current editing of a file without saving changes
Ctrl + KX – Exit the current editing and save the file
Ctrl + KF – Find text in the current file
Ctrl + V – Paste clipboard into document (CMD + V for Mac)
Ctrl + Y – Delete current line where cursor is

To install joe on an Debian based Linux distribution you just need to type:

sudo apt-get install joe

1. When you need to find content in a huge text file GREP will be your best friend. GREP allows you to search for text pattern in files.

$ gerp <pattern> file.log
    -n : number of lines that matches
    -i : case insensitive
    -v : invert matches
    -E : extended regex
    -c : count number of matches
    -l : find filenames that matches the pattern

2. When you need to analyze network packages NGREP is the tool of your choice.

$ ngrep -I file.pcap
    -d : specify the network interface
    -i : case insensitive
    -x : print in alternate hexdump
    -t : print timestamp
    -I : read a pcap file

3. When you need to see the changes between two versions of a file, DIFF will do the job.

$ diff version1.txt version2.txt
    -a : add
    -c : change
    -d : delete
     # : line numbers
     < : file 1
     > : file 2

4. Sometimes it is necessary to give an order to the entries in a file. SORT is gonna to help you with this task.

$ sort file.log 
     -o : write the result to a file
     -r : reverse order
     -n : numerical sort
     -k : sort by column
     -c : check if orderd
     -u : sort and remove
     -f : ignore case
     -h : human sort

5. If you have to replace Strings inside of a huge text, like find and replace you can do that with SED, the stream editor.

$ sed s/regex/replace/g
     -s : search
     -g : replace
     -d : delete
     -w : append to file
     -e : execute command
     -n : suppress output

6. Parsing fields using delimiters in text files can done by using CUT.

 $ cut -d ":" -f 2 file.log
     -d : use the field delimiter
     -f : field numbers
     -c : specific characters position

7. The extraction of substrings who occurred just once in a text file you will reach with UNIQ.

$ uniq file.txt
     -c : count the numbers of duplicates
     -d : print duplicates
     -i : case insesitive

8.  AWK is a programming language consider to manipulate data.

$ awk {print $2} file.log

How to reduce the size of a PDF document

 When you own a big collection of PDF files the used storage space can increasing quite high. Sometimes I own PDF documents with more than 100 MB. Well nowadays this storage capacities are not a big issue. But if you want to backup those files to other mediums like USB pen drives or a DVD it would be great to reduce the file size of you PDF collection.

Long a go I worked with a little scrip that allowed me to reduce the file size of a PDF document significantly. This script called a interactive tool called PDF Sam with some command line parameters. Unfortunately many years ago the software PDF Sam become with this option commercial, so I was needed a new solution.

Before I go closer to my approach I will discuss some basic information about what happens in the background. As first, when your PDF blew up to a huge file is the reason because of the included graphics. If you scanned you handwritten notes to save them in one single archive you should be aware that every scan is a image file. By default the PDF processor already optimize those files. This is why the file size almost don’t get reduced when you try to compress them by a tool like zip.

Scanned images can optimized before to include them to a PDF document by a graphic tool like Gimp. Actions you can perform are reduce the image quality and increase the contrast. Specially for scanned handwritten notes are this steps important. If the contrast is very low and maybe you plan to print those documents, it could happens they are not readable. Another problem in this case is that you can’t apply a text search over the document. A solution to this problem is the usage of an OCR tool to transform text in images back to real text.

We resume shortly the previous minds. When we try to reduce the file size of a PDF we need to reduce the quality of the included images. This can be done by reducing the amount of dots per inch (dpi). Be aware that after the compression the image is still readable. As long you do not plan to do a high quality print like a magazine or a book, nothing will get affected.

When we wanna reduce plenty PDF files in a short time we can’t do all those actions by hand. For instance we need an automated solution. To reach the goal it is important that the tool we use support the command line. The we can create a simple batch job to perform the task without any hands on.

We have several options to optimize the images inside a PDF. If it is a great idea to perform all options, depend on the purpose of the usage.

  1. change the image file to the PNG format
  2. reduce the graphic dimensions to the real printable area
  3. reduce the DPI
  4. change the image color profile to gray-scale

As Ubuntu Linux user I have all of the things I need already together. And now comes the part that I explain you my well working solution.


GPL Ghostscript is used for PostScript/PDF preview and printing. Usually as a back-end to a program such as ghostview, it can display PostScript and PDF documents in an X11 environment.

If you don’t have Ghostscript installed on you system, you can do this very fast.

sudo apt-get update
sudo apt-get -y install ghostscript

 Before you execute any script or command be aware you do not overwrite with the output the existing files. In the case something get wrong you loose all originals to try other options. Before you start to try out anything backup your files or generate the compressed PDF in a separate folder.

gs -sDEVICE=pdfwrite \
   -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/default \
   -dNOPAUSE -dQUIET -dBATCH -dDetectDuplicateImages \
   -dCompressFonts=true \
   -r150 \
   -sOutputFile=output.pdf input.pdf

The important parameter is r150, which reduce the output resolution to 150 dpi. In the manage you can check for more parameters to compress the result more stronger. The given command you are able to place in a script, were its surrounded by a FOR loop to fetch all PDF files in a directory, to write them reduced in another directory.

The command I used for a original file with 260 MB and 640 pages. After the operation was done the size got reduced to around 36 MB. The shrunken file is almost 7 times smaller than the original. A huge different. As you can see in the screenshot, the quality of the pictures is almost identical.

As alternative, in the case you won’t come closer to the command line there is a online PDF compression tool in German and English language for free use available.

PDF Workbench

Linux Systems have many powerful tools to deal with PDF documents. For example the Libreoffice Suite have a button where you can generate for every document a proper PDF file. But sometimes you wish to create a PDF in the printing dialog of any other application in your system. With the cups PDF print driver you enable this functionality on your system.

sudo apt-get install printer-driver-cups-pdf

As I already explained, OCR allows you to extract from graphics text to make a document searchable. When you need to work with this type of software be aware that the result is good, but you cant avoid mistakes. Even when you perform an OCR on a scanned book page, you will find several mistakes. OCRFeeder is a free and very powerful solution for Linux systems.

Another powerful helper is the tool PDF Arranger which allows you to add or remove pages to an existing PDF. You are also able to change the order of the pages.



Grazer Linux Tage 2022

Heimnetz ohne Werbung mit AdGuard auf dem RaspberryPI

Leider ist von Minute 1:00 bis 2:10 kein Tonmitschnitt vorhanden 🙁 – einfach überspringen

Es gibt viele Projekte die sich für einen Raspberry PI eignen. Aus eigener Anwendung zeige ich wie man im Heimnetzwerk das Tool AdGuard in einem Docker Container zum laufen bringt, um damit die Werbung für alle im Netwerk verbundenen Geräte abstellt.

In diesem kleinen Workshop geht es darum auf einem Raspberry PI 4 mit einem Ubuntu Server Docker zum Laufen zu bekommen. Das ist aber der einfachste Schritt, denn dann geht es ans Eingemachte und wir fühlen den Netzwerkmöglichkeiten von Docker ein wenig auf den Zahn. Ein bischen SSH und Shell, ganiert mit Routerkonfiguration und vielen kleine praktischen Tipps runden den Talk ab.


Network spy protection with AdGuard Home on a Raspberry Pi and Docker

Maybe you have bought you like me an Raspberry Pi4 with 4GB RAM and think about what nice things you could do with it. Since the beginning I got the idea to use it as an lightweight home server. Of course you can easily use a mini computer with more power and obviously more energy consumption too. Not a nice idea for a device is running 24/7. As long you don’t plan to mine your own bitcoins or host a high frequented shop system, a PI device should be sufficient.

I was wanted to increase the network security for my network. For this reason I found the application AdGuard which blocks many spy software from internet services you use on every device is connected to the network where AdGuard is running. Sounds great and is not so difficult to do. Let me share with you my experience.

As first let’s have a look to the overall system and perquisites. After the Router from my Internet Service Provider I connected direct by wire my own Network router an Archer C50. On my Rapsbery PI4 with 4GB RAM run as operation system Ubuntu Linux Server x64 (ARM Architecture). The memory card is a 64 GB ScanDisk Ultra. In the case you need a lot of storage you can connect an external SSD or HDD with an USB 3 – SATA adapter. Be aware that you use a storage is made for permanent usage. Western Digital for example have an label called NAS, which is made for this purpose. If you use standard desktop versions they could get broken quite soon. The PI is connected with the router direct by LAN cable.

The first step you need to do is to install on the Ubuntu the Docker service. this is a simple command: apt-get install docker. if you want to get rid of the sudo you need to add the user to the docker group and restart the docker service. If you want to get a bit more familiar with Docker you can check my video Docker basics in less than 10 minutes.

sudo apt-get install docker
sudo gpasswd -a <user> docker
sudo dockerd

After this is done you need to create a network where the AdGuard container is reachable from your router to a static IP address on your PI.

docker network create -d macvlan -o parent=eth0 \
--subnet= \
--ip-range= \
--gateway= \

Before you just copy and past the listing above, you need to change the IP addresses to the ones your network is using. for all the installation, this is the most difficult part. As first the network type we create is macvlan bounded to the network card eth0. eth0 is for the PI4 standard. The name of the network we gonna to create is lan. To get the correct values for subnet, ip-range and gateway you need to connect to your router administration.

To understand the settings, we need a bit of theory. But don’t worry is not much and not that complicated. Mostly your router is reachable by an IP address similar to – this is a static address and something equal we want to have for AdGuard on the PI. The PI itself is in my case reachable by, but this IP we can not use for AdGuard. The plan is to make the AdGuard web interface accessible by the IP OK let’s do it. First we have to switch on our router administration to the point DHCP settings. In the Screenshot you can see my configuration. After you changed your adaptions don’t forget to reboot the router to take affect of the changes.

I configured the dynamic IP range between to This means the first 4 numbers before can be used to connect devices with a static IP. Here we see also the entry for our default gateway. Whit this information we are able to return to our network configuration. the subnet IP is like the gateway just the digits in the last IP segment have to change to a zero. The IP range we had limited to the because is one number less than where we configured where the dynamic IP range started. That’s all we need to know to create our network in Docker on the PI.

Now we need to create in the home directory of our PI the places were AdGuard can store the configuration and the data. This you can do with a simple command in the ssh shell.

mkdir /home/ubuntu/adguard/work
mkdir /home/ubuntu/adguard/conf

As next we have to pull the official AdGuard container from the Docker Hub and create a image. This we do by just one command.

docker run -d --name adguard --restart=always \
-p 3000:3000/tcp --net lan --ip \
-p 53/tcp -p 53/udp -p 67/udp -p 68/udp -p 80/tcp \
-p 784/udp -p 8853/udp \
-p 443/tcp -p 443/udp \
-p 853/tcp -p 853/udp \
-p 5443/tcp -p 5443/udp \
-v /home/ubuntu/adguard/work:/opt/adguardhome/work \
-v /home/ubuntu/adguard/conf:/opt/adguardhome/conf \

The container we create is called adguard and we connect this image to our own created network lan with the IP address Then we have to open a lot of ports AdGuard need to do the job. And finally we connect the two volumes for the configuration and data directory inside of the container. As restart policy we set the container to always, this secure that the service is up again after the server or docker was rebooted.

After the execution of the docker run command you can reach the AdGuard configuration page with your browser under: Here you can create the primary setup to create a login user and so on. After the first setup you can reach the web interface by

The IP address you need now to past into the field DNS Server for the DHCP settings. Save the entries and restart your router to get all changes working. When the router is up open on your browser any web page from the internet to see that everything is working fine. After this you can login into the AdGuard web console to see if there appearing data on the dashboard. If this is happened then you are don e and your home or office network is protected.

If you think this article was helpful and you like it, you can support my work by sharing this post or leave a like. If you have some suggestions feel free to drop a comment.

Installing NextCloud with Docker on a Linux Server

For business it’s sometime important to have a central place where employees and clients are able to interact together. NextCloud is a simple and extendable PHP solution with a huge set of features you can host by yourself, to keep full control of your data. A classical Groupware ready for your own cloud.

If you want to install NextCloud on your own server you need as first a well working PHP installation with a HTTP Server like Apache. Also a Database Management System is mandatory. You can chose between MySQL, MariaDB and PostgreSQL servers. The classical way to install and configure all those components takes a lot of time and the maintenance is very difficult. To overcome all this we use a modern approach with the virtualization tool docker.

The system setup is as the following: Ubuntu x64 Server, PostgreSQL Database, pgAdmin DBMS Management and NextCloud.


  • Docker Basics
  • Installing Docker on a Ubuntu server
  • prepare your database
  • putting all together and make it run
  • insights to operate NextCloud

Docker Container Instructions

# create network
docker network create -d bridge --subnet= service

# postures database server
docker run -d --name postgres --restart=always \
--net service --ip \
-e PGPASSWORD=s3cr3t \
-v /home/ed/postgres/data:/var/lib/postgresql/data \

# copy files from container to host system
docker cp postgres:/var/lib/postgresql/data /home/ed/postgres

# pgAdmin administration tool
docker run -d --name pgadmin --restart=no \
-p 8004:80 --net services --ip \
-e \

# nextcloud container
docker run -d --name nextcloud --restart=always \
-p 8080:80 --net services --ip \
-v /home/ed/_TEMP_/nextcloud:/var/www/html \
-e POSTGRES_DB=nextcloud \
-e POSTGRES_USER=nextcloud \
-e POSTGRES_PASSWORD=nextcloud \


[1] Tutorial: Learn to walk with Docker and PostgreSQL
[2] Ubuntu Server:
[3] Docker :
[4] PostgreSQL
[5] pgAdmin
[6] NextCloud

If you have any question feel free to leave a comment. May you need help to install and operate your own NextCloud installation secure, don’t hesitate to contact us by the contact form. In the case you like the video level a thumbs up and share it.

Amazon affiliate links of my environment: