Talk:Parchive (file format)

From Wikipedia, the free encyclopedia

Merge with PAR File

I vote yes, merge them. I'm willing to do it; they're both small. Have PAR/PAR2 files redirect, though, so we don't get dupes again. -- RevRagnarok 02:03, 15 May 2006 (UTC)

Beat ya to the merge. ;) Cwolfsheep 20:09, 3 June 2006 (UTC)
Cool, I was just looking it over, kudos! -- RevRagnarok Talk Contrib Reverts 16:23, 4 June 2006 (UTC)

Big rewrite

Please comment. Spent quite some time on it. -- RevRagnarok Talk Contrib Reverts 04:52, 10 June 2006 (UTC)

Other uses

Please contribute... I do what I mentioned with CD-R and DVD-R because I have had CD-Rs fade in the sun and get bit errors. The DAR stuff is just an extra precaution. — RevRagnarok Talk Contrib 05:56, 10 June 2006 (UTC)

Here is my Makefile. It skips the subdirectory "Processed". I've used it on Linux and Cygwin:
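# Name of the recovery set created in each directory, and the redundancy in percent.
OUT=RevRagnarok.par2
REDUNDANCY=10

.SILENT: all $(OUT)

# One target per subdirectory, of the form <dir>/$(OUT), with spaces in names escaped.
# ./Processed is pruned, and the entry for "." itself is dropped.
SUBDIRS = $(shell find . -iwholename './Processed' -prune -o -type d -printf "''%P'Y'" | perl -pe 's/^\x27\x27\x27Y\x27//g;s/ /\\ /g;s/\x27\x27/ /g;s/\x27*Y\s*\x27/\/$(OUT) /g')

.PHONY: subdirs $(SUBDIRS)

# Path of this directory (spaces escaped), so the sub-makes can find this Makefile.
HERE = $(shell pwd | perl -pe 's/ /\\ /g')

subdirs: $(SUBDIRS)

# Re-run this Makefile inside each subdirectory.
$(SUBDIRS):
	@$(MAKE) $(MAKEFLAGS) -C "$(patsubst %/$(OUT),%,$@)" -f $(HERE)/Makefile par2_files

par2_files : $(OUT)

# Create the recovery set only when the directory contains .mp3 files.
$(OUT) :
	$(if $(wildcard *.mp3), par2 c -v -r$(REDUNDANCY) $@ *.mp3, )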

RevRagnarok Talk Contrib 03:47, 22 September 2006 (UTC)
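
For readers who do not use make, the same walk can be sketched in Python. This is only a rough sketch of what the Makefile above appears to do, not a drop-in replacement: the function name plan_par2_commands and its parameters are hypothetical, and it only builds the par2 command lines (the same "par2 c -v -r10 ..." form used in the Makefile) without running them.

```python
import os
from pathlib import Path


def plan_par2_commands(root, out_name="RevRagnarok.par2", redundancy=10, skip="Processed"):
    """Return (directory, argv) pairs for subdirectories that still need a recovery set.

    Hypothetical helper: it mirrors the Makefile's rules as I read them.
    """
    root = Path(root)
    plans = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        here = Path(dirpath)
        if here == root:
            # Prune the top-level skip directory (case-insensitive, like -iwholename)
            dirnames[:] = [d for d in dirnames if d.lower() != skip.lower()]
            continue  # the Makefile only descends into subdirectories
        mp3s = sorted(f for f in filenames if f.endswith(".mp3"))
        # Skip directories without .mp3 files or with an existing recovery set
        if mp3s and out_name not in filenames:
            argv = ["par2", "c", "-v", f"-r{redundancy}", out_name, *mp3s]
            plans.append((here, argv))
    return plans


if __name__ == "__main__":
    for directory, argv in plan_par2_commands("."):
        print(directory, " ".join(argv))
```

Each argv could then be passed to subprocess.run with cwd set to the directory, assuming par2 is on the PATH.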

Shouldn't be used for security

PAR does not use cryptographically secure checksums, so it cannot be relied on to show that a file has not been deliberately altered. A cryptographically secure hash algorithm such as SHA-1 should be used for that instead.
-- ArbitraryConstant 20:33, 21 December 2006 (UTC)
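
A small sketch of the difference, under stated assumptions. It uses CRC-32 as a stand-in for a non-cryptographic checksum and shows that the CRC of an XOR of three equal-length messages can be predicted from the three individual CRCs, which is the kind of structure a forger can exploit. For the secure side it uses SHA-256 from Python's hashlib rather than the SHA-1 named above (my substitution, since practical SHA-1 collisions have been published since this comment was written). The helper names are hypothetical.

```python
import hashlib
import zlib


def xor_bytes(*chunks):
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


def crc32_relation_holds(a, b, c):
    """Check that crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c) for equal-length inputs."""
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


def sha256_matches(data, expected_hex):
    """Compare the SHA-256 digest of data against a trusted hex digest."""
    return hashlib.sha256(data).hexdigest() == expected_hex


if __name__ == "__main__":
    a, b, c = b"parity01", b"archive2", b"blocks03"
    print("CRC-32 is predictable under XOR:", crc32_relation_holds(a, b, c))
    trusted = hashlib.sha256(a).hexdigest()
    print("untouched data verifies:", sha256_matches(a, trusted))
    print("altered data verifies:", sha256_matches(b, trusted))
```

The digest only helps if the trusted value is obtained through a channel the attacker cannot modify.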

Simplified Layman's terms explanation

This article needs a simple explanation in layman's terms. I wrote the following based on how I understand error correction to work, but I'm far from an expert. So I'm going to post my proposed article text here and let someone else add it once an expert confirms it is valid.


Error correction works sort of like a game of sudoku. If a single number is missing from any location on a nearly complete sudoku board, then any one of sudoku's rules relating the numbers gives you all the extra information you need to restore that missing number to its intended value. However, as more and more numbers go missing, you need more and more of the rules to recreate them.

So the original data to be protected against errors is like the numbers you start with on an incomplete sudoku board, and the additional parity files you use to restore your data to the original are like the rules you use on a sudoku board to fill the missing squares with their intended values.

If too many squares on your sudoku board are missing (i.e. there are too many errors in your data) then the rules of sudoku don't give you enough information to fill in the missing numbers (i.e. the parity files don't contain enough extra information to repair the errors in your data). But if you knew additional sudoku rules (i.e. had more parity files) then you could recreate the correct set of numbers on your sudoku board (i.e. you could recreate the correct data in the portions which contain errors).

---

If that is approximately correct, then please add this text to the article! =) —Preceding unsigned comment added by 67.101.31.49 (talk) 08:41, 19 September 2007 (UTC)

Yeah, that explanation kind of works. ;) However, working out what is missing in sudoku is probably a little more complicated, and in sudoku you also know a priori which squares are missing, so the analogy isn't close enough to go into the article (IMHO). It's more like picking up a "completed" game out of the trash and seeing two 8s in a single row and saying, "Well, I know that isn't right..." and then fixing it. — RevRagnarok Talk Contrib 23:39, 27 September 2007 (UTC)

---

I don't believe this article needs a "simple layman's terms" explanation. It should say that Reed-Solomon encoding is used and then point people to the Reed-Solomon page. The "simple layman's terms" explanation should go on the Reed-Solomon page.

Also, the Sudoku analogy is poor. It is best described using a set of integers. The first redundant block is the sum of all the integers. If any one integer is missing, the missing value is equal to the sum minus all the present values. The second redundant block is equal to the sum of the squares of all the integers. If two values, X and Y, are missing, you can use the sums to find X + Y and X^2 + Y^2, and given two equations with two unknowns, you can solve for X and Y. The third redundant block is the sum of all the cubes, the fourth redundant block is the sum of all the integers to the fourth power, etc. The equations get more complicated, but they're all solvable.
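
The integer version of this can be sketched in a few lines of Python. This is only an illustration of the analogy with ordinary integers, not of Par2's actual arithmetic, and the function names are made up for the example. Note that the plain sum and sum of squares are symmetric in X and Y, so the sketch recovers the two missing values but not which slot each one belongs to.

```python
from math import isqrt


def make_redundancy(values):
    """First two 'redundant blocks': the sum and the sum of squares."""
    return sum(values), sum(v * v for v in values)


def recover_one(present, total):
    """One value missing: it equals the sum minus all the present values."""
    return total - sum(present)


def recover_two(present, total, total_sq):
    """Two values missing: solve X + Y = s and X^2 + Y^2 = q over the integers."""
    s = total - sum(present)                      # X + Y
    q = total_sq - sum(v * v for v in present)    # X^2 + Y^2
    disc = 2 * q - s * s                          # equals (X - Y)^2
    if disc < 0:
        raise ValueError("redundancy does not match the present values")
    root = isqrt(disc)
    if root * root != disc or (s + root) % 2:
        raise ValueError("redundancy does not match the present values")
    return (s - root) // 2, (s + root) // 2       # smaller value first


if __name__ == "__main__":
    data = [3, 14, 15, 92, 65]
    total, total_sq = make_redundancy(data)
    print(recover_one([3, 14, 15, 65], total))        # 92 was lost
    print(recover_two([3, 15, 65], total, total_sq))  # 14 and 92 were lost
```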

Reed-Solomon encoding is just that. The only difference is that it uses different "addition" and "multiplication" operations. Adding two 32-bit unsigned integers can generate a 33-bit value, and multiplying two 32-bit unsigned integers can generate a 64-bit value. The addition and multiplication used in Reed-Solomon are defined over a Galois field and have the special property that adding or multiplying two 32-bit values always generates a 32-bit value.
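
To make the "closed" arithmetic concrete, here is a minimal sketch using a much smaller field than the 32-bit example above: GF(2^8) with the reduction polynomial 0x11B. That field is chosen purely for brevity and is an assumption of this example, not the field Par2 uses. Addition is XOR, and multiplication is a shift-and-XOR product reduced modulo the polynomial, so both results always fit in 8 bits.

```python
def gf_add(a, b):
    """Addition in GF(2^8): bitwise XOR, so no carry can overflow."""
    return a ^ b


def gf_mul(a, b, poly=0x11B):
    """Multiplication in GF(2^8), reduced modulo the given polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add the current multiple of a
        a <<= 1              # multiply a by x
        if a & 0x100:
            a ^= poly        # reduce so a stays within 8 bits
        b >>= 1
    return result


if __name__ == "__main__":
    print(hex(gf_add(0xFF, 0x01)))
    print(hex(gf_mul(0x57, 0x83)))
    # Every sum and product of two 8-bit values is again an 8-bit value.
    print(all(gf_mul(a, b) < 256 and gf_add(a, b) < 256
              for a in range(256) for b in range(256)))
```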

Anyway, this "layman's terms" description should go on the Reed-Solomon page. Par1 is Reed-Solomon but the Par2 file format is flexible enough to support other error correcting codes, such as Tornado Codes, LT Codes, or Online Codes.

- Michael Nahas (designer of the Par2 file format). —Preceding unsigned comment added by Mdnahas (talk • contribs) 23:12, 13 May 2008 (UTC)