Advanced Programming in the UNIX Environment, Episode 28

Command-Line Arguments

When a program is executed, the process that does the exec can pass command-line arguments to the new program. This is part of the normal operation of the UNIX system shells.

#include "apue.h"

int main(int argc,char *argv[])
{
    int i;

    for(i=0;i<argc;i++)
    {
        printf("argv[%d]:%s\n",i,argv[i]);
    }
    exit(0);
}

Echo all command-line arguments to standard output

We are guaranteed by both ISO C and POSIX.1 that argv[argc] is a null pointer. This lets us alternatively code the argument-processing loop as

for (i = 0; argv[i] != NULL; i++)

Environment List

Each program is also passed an environment list. Like the argument list, the environment list is an array of character pointers, with each pointer containing the address of a null-terminated C string. The address of the array of pointers is contained in the global variable environ:

extern char **environ;

By convention, the environment consists of

name=value

strings. Most predefined names are entirely uppercase, but this is only a convention.

Historically, most UNIX systems have provided a third argument to the main function that is the address of the environment list. Because ISO C specifies that the main function be written with two arguments, and because this third argument provides no benefit over the global variable environ, POSIX.1 specifies that environ should be used instead of the (possible) third argument. Access to specific environment variables is normally through the getenv and putenv functions, described in Section 7.9, instead of through the environ variable.

Memory Layout of a C Program

Historically, a C program has been composed of the following pieces:

  • Text segment, consisting of the machine instructions that the CPU executes.
  • Initialized data segment, usually called simply the data segment, containing variables that are specifically initialized in the program.
  • Uninitialized data segment, often called the ‘bss’ segment, named after an ancient assembler operator that stood for ‘block started by symbol.’
  • Stack, where automatic variables are stored, along with information that is saved each time a function is called.
  • Heap, where dynamic memory allocation usually takes place.

With Linux on a 32-bit Intel x86 processor, the text segment starts at location 0x08048000, and the bottom of the stack starts just below 0xC0000000. (The stack grows from higher-numbered addresses to lower-numbered addresses on this particular architecture.)

Shared Libraries

Most UNIX systems today support shared libraries.

Different systems provide different ways for a program to say that it wants to use or not use the shared libraries.

Memory Allocation

ISO C specifies three functions for memory allocation:

1. malloc, which allocates a specified number of bytes of memory. The initial value of the memory is indeterminate.
2. calloc, which allocates space for a specified number of objects of a specified size. The space is initialized to all 0 bits.
3. realloc, which increases or decreases the size of a previously allocated area. When the size increases, it may involve moving the previously allocated area somewhere else, to provide the additional room at the end. Also, when the size increases, the initial value of the space between the old contents and the end of the new area is indeterminate.

#include <stdlib.h>

void *malloc(size_t size);
void *calloc(size_t nobj,size_t size);
void *realloc(void *ptr,size_t newsize);

void free(void *ptr);

The pointer returned by the three allocation functions is guaranteed to be suitably aligned so that it can be used for any data object.

Alternate Memory Allocators

libmalloc
vmalloc
quick-fit
jemalloc
TCMalloc

alloca Function

Environment Variables

As we mentioned earlier, the environment strings are usually of the form

name=value

The UNIX kernel never looks at these strings; their interpretation is up to the various applications. The shells, for example, use numerous environment variables. Some, such as HOME and USER, are set automatically at login; others are left for us to set. We normally set environment variables in a shell start-up file to control the shell’s actions.

ISO C defines a function that we can use to fetch values from the environment, but this standard says that the contents of the environment are implementation defined.

#include <stdlib.h>

char *getenv(const char *name);

Note that this function returns a pointer to the value of a name=value string, or NULL if the name is not found. We should always use getenv to fetch a specific value from the environment, instead of accessing environ directly.

In addition to fetching the value of an environment variable, sometimes we may want to set an environment variable. We may want to change the value of an existing variable or add a new variable to the environment.

The clearenv function is not part of the Single UNIX Specification. It is used to remove all entries from the environment list.

The prototypes for the functions that modify the environment are

#include <stdlib.h>

int putenv(char *str);

int setenv(const char *name,const char *value,int rewrite);
int unsetenv(const char *name);

The operation of these three functions is as follows:

• The putenv function takes a string of the form name=value and places it in the environment list. If name already exists, its old definition is first removed.
• The setenv function sets name to value. If name already exists in the environment, then (a) if rewrite is nonzero, the existing definition for name is first removed; or (b) if rewrite is 0, an existing definition for name is not removed, name is not set to the new value, and no error occurs.
• The unsetenv function removes any definition of name. It is not an error if such a definition does not exist.

Note the difference between putenv and setenv. Whereas setenv must allocate memory to create the name=value string from its arguments, putenv is free to place the string passed to it directly into the environment. Indeed, many implementations do exactly this, so it would be an error to pass putenv a string allocated on the stack, since the memory would be reused after we return from the current function.

Modifying the environment list is more involved than it looks. The list and the strings it points to are typically stored at the top of the process address space, above the stack, so they cannot grow in place. Deleting a string is simple, but adding or changing one works as follows:

1. If we’re modifying an existing name:

a. If the size of the new value is less than or equal to the size of the existing value, we can just copy the new string over the old string.

b. If the size of the new value is larger than the old one, however, we must malloc to obtain room for the new string, copy the new string to this area, and then replace the old pointer in the environment list for name with the pointer to this allocated area.

2. If we’re adding a new name, it’s more complicated. First, we have to call malloc to allocate room for the name=value string and copy the string to this area.

a. Then, if it’s the first time we’ve added a new name, we have to call malloc to obtain room for a new list of pointers. We copy the old environment list to this new area and store a pointer to the name=value string at the end of this list of pointers. We also store a null pointer at the end of this list, of course. Finally, we set environ to point to this new list of pointers. Note from Figure 7.6 that if the original environment list was contained above the top of the stack, as is common, then we have moved this list of pointers to the heap. But most of the pointers in this list still point to name=value strings above the top of the stack.

b. If this isn’t the first time we’ve added new strings to the environment list, then we know that we’ve already allocated room for the list on the heap, so we just call realloc to allocate room for one more pointer. The pointer to the new name=value string is stored at the end of the list (on top of the previous null pointer), followed by a null pointer.

setjmp and longjmp Functions

In C, we can’t goto a label that’s in another function. Instead, we must use the setjmp and longjmp functions to perform this type of branching. As we’ll see, these two functions are useful for handling error conditions that occur in a deeply nested function call.

#include "apue.h"

#define TOK_ADD 5

void do_line(char *);
void cmd_add(void);
int get_token(void);

int main(int argc,char *argv[])
{
    char line[MAXLINE];

    while(fgets(line,MAXLINE,stdin)!=NULL)
        do_line(line);

    return 0;
}

char *tok_ptr;

void do_line(char *ptr)
{
    int cmd;

    tok_ptr=ptr;
    while((cmd=get_token())>0)
    {
        switch(cmd)
        {
            case TOK_ADD:
                cmd_add();
                break;
        }
    }
}

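void cmd_add(void)
{
    int token;

    token=get_token();
    /* rest of processing for this command */
}

int get_token(void)
{
    /* fetch next token from line pointed to by tok_ptr */
    return 0; /* placeholder so the skeleton compiles; 0 means no more tokens */
}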

Typical program skeleton for command processing

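Suppose cmd_add encounters a nonfatal error, such as an invalid number, and wants to ignore the rest of the input line and return to main to read the next line. Because cmd_add is nested two levels below main, doing this with return values means that every intermediate function must check for a special value and pass it along. The solution to this problem is to use a nonlocal goto: the setjmp and longjmp functions. The adjective "nonlocal" indicates that we’re not doing a normal C goto statement within a function; instead, we’re branching back through the call frames to a function that is in the call path of the current function.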

#include <setjmp.h>
int setjmp(jmp_buf env);

void longjmp(jmp_buf env,int val);

We call setjmp from the location that we want to return to, which in this example is in the main function. In this case, setjmp returns 0 because we called it directly. In the call to setjmp, the argument env is of the special type jmp_buf. This data type is some form of array that is capable of holding all the information required to restore the status of the stack to the state when we call longjmp. Normally, the env variable is a global variable, since we’ll need to reference it from another function.

When we encounter an error (say, in the cmd_add function), we call longjmp with two arguments. The first is the same env that we used in a call to setjmp, and the second, val, is a nonzero value that becomes the return value from setjmp. The second argument allows us to use more than one longjmp for each setjmp.

#include "apue.h"
#include <setjmp.h>

#define TOK_ADD 5

jmp_buf jmpbuffer;

int main(int argc,char *argv[])
{
    char line[MAXLINE];

    if(setjmp(jmpbuffer)!=0)
        printf("error\n");
    while(fgets(line,MAXLINE,stdin)!=NULL)
        do_line(line);

    return 0;
}

void cmd_add(void)
{
    int token;

    token=get_token();
    if(token<0)
        longjmp(jmpbuffer,1);
}

Example of setjmp and longjmp

Automatic, Register, and Volatile Variables

When longjmp returns control to main, do the automatic and register variables there hold the values they had when setjmp was called, or the values they had at the time of the longjmp? Most implementations do not try to roll back these automatic variables and register variables, but the standards say only that their values are indeterminate. If you have an automatic variable that you don’t want rolled back, define it with the volatile attribute. Variables that are declared as global or static are left alone when longjmp is executed.

#include "apue.h"
#include <setjmp.h>

static void f1(int,int,int,int);
static void f2(void);

static jmp_buf jmpbuffer;
static int globval;

int main(void)
{
    int autoval;
    register int regival;
    volatile int volaval;
    static int statval;

    globval=1;autoval=2;regival=3;volaval=4;statval=5;

    if(setjmp(jmpbuffer)!=0)
    {
        printf("after longjmp:\n");
        printf("globval=%d,autoval=%d,regival=%d,"
            "volaval=%d,statval=%d\n",globval,autoval,regival,volaval,statval);
        return 0;
    }

    /* change variables after setjmp, but before longjmp */
    globval=95;autoval=96;regival=97;volaval=98;
    statval=99;

    f1(autoval,regival,volaval,statval); /* never returns */
    return 0;
}

static void f1(int i,int j,int k,int l)
{
    printf("in f1():\n");
    printf("globval=%d,autoval=%d,regival=%d,"
        "volaval=%d,statval=%d\n",globval,i,j,k,l);
    f2();
}

static void f2(void)
{
    longjmp(jmpbuffer,1);
}

Effect of longjmp on various types of variables

Potential Problem with Automatic Variables

Having looked at the way stack frames are usually handled, it is worth looking at a potential error in dealing with automatic variables. The basic rule is that an automatic variable can never be referenced after the function that declared it returns. Numerous warnings about this can be found throughout the UNIX System manuals.

#include <stdio.h>

FILE *open_data(void)
{
    FILE *fp;
    char databuf[BUFSIZ];

    if((fp=fopen("datafile","r"))==NULL)
    {
        return NULL;
    }

    if(setvbuf(fp,databuf,_IOLBF,BUFSIZ)!=0)
        return NULL;

    return fp;
}

Incorrect usage of an automatic variable

The problem is that when open_data returns, the space it used on the stack will be used by the stack frame for the next function that is called. But the standard I/O library will still be using that portion of memory for its stream buffer. Chaos is sure to result. To correct this problem, the array databuf needs to be allocated from global memory, either statically (static or extern) or dynamically (one of the alloc functions).
