Implementation of a Computer Immune System for Intrusion- and
Virus Detection
Markus Christoph Unterleitner
office@unterleitner.info
February 13, 2006
Contents
1. Introduction ........................................................................................................ 11
1.1 Strategies of intrusion detection systems (IDS)................................................................. 12
1.2 Overview of the thesis.................................................................................................. 12
2. The Immune System of the Human Body ........................................................ 15
2.1 Architecture overview ....................................................................................................... 17
2.1.1 The innate immune system ......................................................................................... 17
2.1.2 The adaptive immune system...................................................................................... 18
2.2 The adaptive detection of pathogens ................................................................................. 19
2.3 The learning mechanism of the adaptive system ............................................................... 20
2.3.1 The B-Cells ................................................................................................................. 20
2.3.2 The T-Cells ................................................................................................................. 21
2.3.3 The costimulatory process .......................................................................................... 23
2.3.4 Antigen processing and presenting ............................................................................. 23
2.4 Detection failures made by the immune system ................................................................ 24
2.5 Summary of the tasks of immune system cells ............................................................ 25
3. Modelling a Computer Immune System........................................................... 27
3.1 Related Work - UNM’s Computer Immune System ......................................................... 28
3.1.1 Overview of ARTIS .................................................................................................... 28
3.1.2 Architecture of ARTIS................................................................................................ 29
3.1.3 Using ARTIS for Network Intrusion Detection .......................................................... 30
3.1.4 Results from LISYS .................................................................................................... 30
3.1.5 Limitations of LISYS.................................................................................................. 31
3.2 Modelling a hybrid system architecture ............................................................................ 31
3.2.1 Defence mechanisms of the human immune system .................................................. 32
3.2.2 Defence mechanisms of the computer immune system .............................................. 33
3.2.2.1 The defence mechanisms on the first level........................................................... 33
3.2.2.2 The defence mechanisms on the second level ...................................................... 34
3.2.2.3 The defence mechanisms on the third level ......................................................... 35
3.2.2.4 The defence mechanisms on the fourth level ....................................................... 35
3.3 The representation of self and nonself............................................................................... 35
3.4 The detectors ..................................................................................................................... 36
3.5 Training the detection system............................................................................................ 37
3.5.1 Memory based detection ............................................................................................. 39
3.5.2 The costimulation signal ............................................................................................. 40
4. Matching algorithms .......................................................................................... 43
4.1 Pearson Product-Moment Correlation Coefficient ............................................................ 43
4.2 Approximate string matching approaches ......................................................................... 44
4.2.1 Hamming distance ...................................................................................................... 44
4.2.2 Levenshtein or Edit Distance ...................................................................................... 46
4.2.3 R-Contiguous Symbols ............................................................................................... 47
4.2.4 Longest Common Subsequence .................................................................................. 49
5. Implementing the Computer Immune System using SNORT........................ 51
5.1 Overview of the Snort Framework.................................................................................... 52
5.1.1 Snort’s Detection Engine ............................................................................................ 53
5.1.2 Snort’s Preprocessor ................................................................................................... 53
5.2 Architecture of the computer immune system................................................................... 53
5.3 Architectural differences from ARTIS and LISYS ................................................................. 54
5.4 The sensor of the computer immune system ..................................................................... 56
5.4.1 Configuration of the preprocessor............................................................................... 57
5.4.1.1 The preprocessor configuration file...................................................................... 57
5.4.1.2 The preprocessor data file .................................................................................... 58
5.4.2 The logfile of the preprocessor ................................................................................... 60
5.4.3 Applying the string match algorithms......................................................................... 60
5.4.3.1 Overview of the algorithm implementation .................................................... 60
5.4.3.2 Implementation of the Hamming Distance match algorithm ............................... 61
5.4.3.3 Implementation of the r-contiguous symbols match algorithm............................ 61
5.4.3.4 Implementation of the Levenshtein Distance match algorithm............................ 62
5.4.3.5 Implementation of the Longest Common Subsequence match algorithm............ 62
5.4.4 Using the Preprocessing Engine for payload processing ............................................ 63
5.4.4.1 Simulated network traffic ..................................................................................... 63
5.4.4.2 Measuring the success of a learning process ........................................................ 64
5.4.4.3 Overview of engine testing................................................................................... 64
5.4.4.3.1 The first approach, without the use of packet filters...................................... 65
5.4.4.3.2 The second approach, with the use of packet filters ...................................... 65
5.4.4.3.3 Factors that influence the training ................................................................. 66
5.4.4.4 Results of the first approach (without filters)....................................................... 66
5.4.4.4.1 The Pearson correlation coefficient ............................................................... 66
5.4.4.4.2 Hamming Distance ........................................................................................ 68
5.4.4.4.3 r-contiguous bytes.......................................................................................... 69
5.4.4.4.4 Levenshtein distance or edit distance ............................................................ 70
5.4.4.4.5 Longest common subsequence (LCS) ........................................................... 71
5.4.4.5 Results of the second approach (using filters)...................................................... 72
5.4.4.5.1 The Pearson correlation coefficient ............................................................... 73
5.4.4.5.2 Hamming Distance ........................................................................................ 73
5.4.4.5.3 r-contiguous bytes.......................................................................................... 74
5.4.4.5.4 Longest Common Subsequence (LCS).......................................................... 76
5.4.4.6 Conclusion on the training results of both approaches......................................... 77
5.4.5 Modification of the data representation ...................................................................... 77
5.5 Using the Preprocessing Engine for packet header processing ......................................... 78
5.5.1 Description of the data representation......................................................................... 78
5.5.2 Training of the detector set ......................................................................................... 79
5.5.3 Results of the applied algorithms................................................................................ 79
5.6 The monitor of the computer immune system ................................................................... 80
5.6.1 Periodically produced events ...................................................................................... 81
5.6.2 Events produced by detectors in the mature and memory state .................................. 81
5.6.3 Updates of the MySQL database .............................................................................. 81
5.7 The computer immune system database ............................................................................ 82
5.8 Web-Frontend.................................................................................................................... 82
5.9 Incident Object Description and Exchange Format (IODEF) ........................................... 83
5.9.1 IODEF Overview ........................................................................................................ 83
5.9.2 The contents of the report in the computer immune system ....................................... 83
6. Changing the data representation..................................................................... 85
6.1 Disassembling the payload data ........................................................................................ 86
6.2 The instruction spectrum of different file formats............................................................. 87
6.3 The effect of the entry point on disassembled instructions ............................................... 87
6.3.1 Overview of the problem ............................................................................................ 87
6.3.2 Investigation of PE executables and other file formats............................................... 88
6.3.2.1 Finding the synchronisation point ........................................................................ 89
6.3.2.2 Details of the investigation................................................................................... 89
6.4 Results of the investigation................................................................................................ 91
6.4.1 The position of the instruction synchronisation .......................................................... 91
6.4.2 Absolute equal instructions ......................................................................................... 92
6.4.3 Relative position of the RIS to the sliding window .................................................... 93
6.5 The reconstruction of the original instruction sequence .................................................... 94
6.6 Conclusion and further work ............................................................................................. 96
6.7 Implementing the payload disassembling function in the preprocessor ............................ 97
6.8 Conclusion on the results of the packet disassembling approach ...................................... 98
7. Conclusion ........................................................................................................... 99
7.1 Problem of the useful detector symbol length ................................................................. 100
7.2 Problem of the nonself coverage ..................................................................................... 102
7.3 Solutions and suggested future work............................................................................... 102
Appendix A.................................................................................................................. 103
A.1 Text based file types ....................................................................................................... 103
A.1.1 Byte spectrum of HTM Pages .................................................................................. 104
A.1.2 Byte spectrum of CPP Source files .......................................................................... 104
A.1.3 Byte spectrum of C Source files............................................................................... 104
A.1.4 Byte spectrum of FRM Source files......................................................................... 105
A.1.5 Byte spectrum of TXT Files..................................................................................... 105
A.2 Binary based file types.................................................................................................... 106
A.2.1 Byte spectrum of PDF files ...................................................................................... 106
A.2.2 Byte spectrum of JPG Image files............................................................................ 106
A.2.3 Byte spectrum of ZIP Archive files ......................................................................... 107
A.2.4 Byte spectrum of MP3 Music files........................................................................... 107
A.3 Executable file types....................................................................................................... 107
A.3.1 Byte spectrum of EXE files...................................................................................... 108
A.3.2 Byte spectrum of DLL files...................................................................................... 108
A.4 Byte Spectrum of the implemented filters ...................................................................... 108
A.4.1 Text filter.................................................................................................................. 109
A.4.2 Binary filter .............................................................................................................. 109
Appendix B.................................................................................................................. 111
B.1 Results of the tested algorithms in chapter 5 .................................................................. 111
B.1.1 First approach, without filters .................................................................................. 111
B.1.1.1 Pearson correlation coefficient .......................................................................... 111
B.1.1.2 Hamming Distance ............................................................................................ 112
B.1.1.3 r-contiguous bytes.............................................................................................. 114
B.1.1.4 Levenshtein distance or edit distance ................................................................ 114
B.1.1.5 Longest common subsequence (LCS) ............................................................... 115
B.1.2 Second approach (using filters) ................................................................................ 116
B.1.2.1 The Pearson correlation coefficient ................................................................... 116
B.1.2.2 Hamming Distance ............................................................................................ 116
B.1.2.3 r-contiguous bytes.............................................................................................. 117
B.1.2.4 Longest common subsequence (LCS) ............................................................... 118
B.2 Results of the tested algorithm in chapter 6 .................................................................... 119
Appendix C.................................................................................................................. 121
C.1 Text based file types ....................................................................................................... 121
C.1.1 Instruction spectrum of CPP Source Files................................................................ 121
C.1.2 Instruction spectrum of TXT Files ........................................................................... 122
C.1.3 Instruction spectrum of HTM Pages ........................................................................ 122