reuse.copyright module

Utilities related to the parsing and storing of copyright notices.

class reuse.copyright.FourDigitString

A string that is four digits long.

alias of str

reuse.copyright.YearRangeSeparator

A range separator between two years.

alias of Literal['--', '–', '-']

reuse.copyright.is_four_digits(value: str) -> FourDigitString | Literal[False]

Identify a string as a four-digit string. Return the string as FourDigitString if it is one.

>>> is_four_digits("1234")
'1234'
>>> is_four_digits("abcd")
False
>>> is_four_digits("12345")
False
reuse.copyright.validate_four_digits(value: str) -> FourDigitString

Validate whether a given string is a FourDigitString.

>>> validate_four_digits("1234")
'1234'
>>> validate_four_digits("abcd")
Traceback (most recent call last):
    ...
ValueError: 'abcd' is not a four-digit year.
>>> validate_four_digits("12345")
Traceback (most recent call last):
    ...
ValueError: '12345' is not a four-digit year.
Raises:

ValueError – The string is not four digits.
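The two helpers above could be written roughly as follows. This is a minimal sketch, not the library's actual implementation: the `_sketch` function names, the `_FOUR_DIGITS` pattern, and the choice to accept only ASCII digits are assumptions made for illustration.

```python
import re
from typing import Literal, Union

# Hypothetical stand-in for the module's FourDigitString alias.
FourDigitString = str

# Assumption: only ASCII digits count as digits here.
_FOUR_DIGITS = re.compile(r"[0-9]{4}")


def is_four_digits_sketch(value: str) -> Union[FourDigitString, Literal[False]]:
    """Return value if it is exactly four digits, otherwise False."""
    if _FOUR_DIGITS.fullmatch(value):
        return value
    return False


def validate_four_digits_sketch(value: str) -> FourDigitString:
    """Return value if it is exactly four digits, otherwise raise ValueError."""
    result = is_four_digits_sketch(value)
    if result is False:
        # Message modelled on the doctest output shown above.
        raise ValueError(f"{value!r} is not a four-digit year.")
    return result
```

Returning the string itself rather than True lets a caller test and keep the value in a single step, while the validating variant suits code paths where an invalid year should stop processing.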

reuse.copyright.YEAR_RANGE_PATTERN = re.compile('(?P<start>\\d{4})(?:(?:(?P<separator_nonspaced>(?:--|–|-))(?P<end_nonspaced>\\S+)?)|(?:\\s+(?P<separator_spaced>(?:--|–|-))\\s+(?P<end_spaced>\\d{4})))?')

A regex pattern to match e.g. ‘2017-2020’.
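To see which named groups the pattern fills, the sketch below recompiles the expression from the listing above and applies it with `fullmatch`. The helper `split_year_range` is hypothetical, and applying the pattern with `fullmatch` is an assumption; the library may use it differently.

```python
import re

# Pattern copied from the YEAR_RANGE_PATTERN listing above, split for readability.
YEAR_RANGE_PATTERN = re.compile(
    r"(?P<start>\d{4})"
    r"(?:"
    r"(?:(?P<separator_nonspaced>(?:--|–|-))(?P<end_nonspaced>\S+)?)"
    r"|"
    r"(?:\s+(?P<separator_spaced>(?:--|–|-))\s+(?P<end_spaced>\d{4}))"
    r")?"
)


def split_year_range(text):
    """Hypothetical helper: return (start, separator, end), or None if no match."""
    match = YEAR_RANGE_PATTERN.fullmatch(text)
    if match is None:
        return None
    # Only one of the spaced/non-spaced alternatives can have matched.
    separator = match["separator_nonspaced"] or match["separator_spaced"]
    end = match["end_nonspaced"] or match["end_spaced"]
    return match["start"], separator, end


print(split_year_range("2017-2020"))      # ('2017', '-', '2020')
print(split_year_range("2017 - 2020"))    # ('2017', '-', '2020')
print(split_year_range("2017--Present"))  # ('2017', '--', 'Present')
print(split_year_range("2017"))           # ('2017', None, None)
```

Note that the non-spaced alternative accepts any run of non-whitespace as the end (such as 'Present'), whereas the spaced alternative requires a four-digit end.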

reuse.copyright.COPYRIGHT_NOTICE_PATTERN = re.compile('(?P<prefix>(SPDX-(File|Snippet)CopyrightText:(\\s*((©|\\([Cc]\\))|(Copyright((\\s*(©|\\([Cc]\\)))|(?=\\s)))))?|(Copyright((\\s*(©|\\([Cc]\\)))|(?=\\s)))|©))\\s*(?P<text>.*?)\\s*')

A regex pattern to match a complete and valid REUSE copyright notice.
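The following sketch recompiles the expression from the listing above to show how a line splits into the `prefix` and `text` groups. The helper `split_notice` is hypothetical, and the use of `fullmatch` is an assumption: because `text` is matched lazily, the pattern needs to be anchored at the end of the string for `text` to capture anything.

```python
import re

# Pattern copied from the COPYRIGHT_NOTICE_PATTERN listing above, split for readability.
COPYRIGHT_NOTICE_PATTERN = re.compile(
    r"(?P<prefix>(SPDX-(File|Snippet)CopyrightText:"
    r"(\s*((©|\([Cc]\))|(Copyright((\s*(©|\([Cc]\)))|(?=\s)))))?"
    r"|(Copyright((\s*(©|\([Cc]\)))|(?=\s)))"
    r"|©))"
    r"\s*(?P<text>.*?)\s*"
)


def split_notice(line):
    """Hypothetical helper: return (prefix, text), or None if line is no notice."""
    match = COPYRIGHT_NOTICE_PATTERN.fullmatch(line)
    if match is None:
        return None
    return match["prefix"], match["text"]


print(split_notice("SPDX-FileCopyrightText: 2019 Jane Doe"))
# ('SPDX-FileCopyrightText:', '2019 Jane Doe')
print(split_notice("Copyright (C) 2019 Jane Doe"))
# ('Copyright (C)', '2019 Jane Doe')
print(split_notice("Copyrighted material"))
# None
```

The `(?=\s)` lookahead is what rejects 'Copyrighted material': a bare 'Copyright' only counts as a prefix when whitespace follows it.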

class reuse.copyright.CopyrightPrefix(value)

Bases: Enum

The prefix used for a copyright notice.

SPDX = 'SPDX-FileCopyrightText:'
SPDX_C = 'SPDX-FileCopyrightText: (C)'
SPDX_SYMBOL = 'SPDX-FileCopyrightText: ©'
SPDX_STRING = 'SPDX-FileCopyrightText: Copyright'
SPDX_STRING_C = 'SPDX-FileCopyrightText: Copyright (C)'
SPDX_STRING_SYMBOL = 'SPDX-FileCopyrightText: Copyright ©'
SNIPPET = 'SPDX-SnippetCopyrightText:'
SNIPPET_C = 'SPDX-SnippetCopyrightText: (C)'
SNIPPET_SYMBOL = 'SPDX-SnippetCopyrightText: ©'
SNIPPET_STRING = 'SPDX-SnippetCopyrightText: Copyright'
SNIPPET_STRING_C = 'SPDX-SnippetCopyrightText: Copyright (C)'
SNIPPET_STRING_SYMBOL = 'SPDX-SnippetCopyrightText: Copyright ©'
STRING = 'Copyright'
STRING_C = 'Copyright (C)'
STRING_SYMBOL = 'Copyright ©'
SYMBOL = '©'
static lowercase_name(name: str) -> str

Given an uppercase NAME, return the lowercase name. Underscores are converted to dashes.

>>> CopyrightPrefix.lowercase_name("SPDX_STRING")
'spdx-string'
static uppercase_name(name: str) -> str

Given a lowercase name, return the uppercase NAME. Dashes are converted to underscores.

>>> CopyrightPrefix.uppercase_name("spdx-string")
'SPDX_STRING'
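The two conversions are inverses of each other. A minimal sketch of the idea, using a cut-down stand-in enum (`PrefixSketch`, hypothetical) with three of the values listed above; looking a member up from a dash-separated name is one plausible use, not a documented one:

```python
from enum import Enum


class PrefixSketch(Enum):
    """Hypothetical, reduced stand-in for CopyrightPrefix."""

    SPDX = "SPDX-FileCopyrightText:"
    STRING_C = "Copyright (C)"
    SYMBOL = "©"


def lowercase_name(name: str) -> str:
    """'STRING_C' -> 'string-c'."""
    return name.lower().replace("_", "-")


def uppercase_name(name: str) -> str:
    """'string-c' -> 'STRING_C'."""
    return name.upper().replace("-", "_")


# Assumed use: map a dash-separated name back to an enum member.
member = PrefixSketch[uppercase_name("string-c")]
print(member.value)                  # Copyright (C)
print(lowercase_name(member.name))   # string-c
```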
class reuse.copyright.YearRange(start: FourDigitString, separator: Literal['--', '–', '-'] | None = None, end: FourDigitString | str | None = None)

Bases: object

Represents a year range, such as ‘2017-2025’, or ‘2017’. This only represents a single range; multiple separated ranges should be put in a collection (typically a tuple).

start: FourDigitString

The first year in the range. If it is only a single year, this is the only relevant value.

separator: Literal['--', '–', '-'] | None = None

The separator between start and end. If no value for end is provided, a range into infinity is implied, and end becomes an empty string.

end: FourDigitString | str | None = None

The second year in the range. This can also be a word like ‘Present’. This is bad practice, but still supported.

original: str | None = None

If parsed from a string, this contains the original string.

classmethod from_string(value: str) -> YearRange

Create a YearRange object from a string.

Raises:

YearRangeParseError – The string is not a valid year range.

classmethod tuple_from_string(value: str) -> tuple[YearRange, ...]

Create a tuple of YearRange objects from a string containing multiple ranges.

Raises:

YearRangeParseError – The substring is not a valid year range.

to_string(original: bool = False) -> str

Converts the internal representation of the date range into a string. If original is True, original is returned if it exists.

If start and end are provided without separator, - will be used as default separator in the output.

This method is identical to calling str() on this object, provided original is False.
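The formatting rules described for to_string can be sketched with a small stand-in dataclass. `YearRangeSketch` is hypothetical and only mirrors the documented fields; details such as whether the real output ever contains spaces around the separator are not stated here, and the sketch assumes it does not.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class YearRangeSketch:
    """Hypothetical stand-in mirroring the documented YearRange fields."""

    start: str
    separator: Optional[str] = None
    end: Optional[str] = None
    original: Optional[str] = None

    def to_string(self, original: bool = False) -> str:
        # Prefer the original text when asked for and available.
        if original and self.original is not None:
            return self.original
        # A single year: nothing but the start is relevant.
        if self.separator is None and self.end is None:
            return self.start
        # Default to '-' when start and end are given without a separator.
        separator = self.separator or "-"
        # A separator without an end yields an open range such as '2017-'.
        return f"{self.start}{separator}{self.end or ''}"

    def __str__(self) -> str:
        return self.to_string()


print(YearRangeSketch("2017"))                # 2017
print(YearRangeSketch("2017", end="2020"))    # 2017-2020
print(YearRangeSketch("2017", "--"))          # 2017--
```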

classmethod compact(ranges: Iterable[YearRange]) -> tuple[YearRange, ...]

Given an iterable of YearRange, compact them such that a new, more concise tuple is returned without losing information. This process also sorts the ranges, so that ranges with lower starts come before ranges with higher starts.

  • Three or more consecutive years (e.g. 2017, 2018, 2019) are turned into a single range (2017-2019).

  • Two consecutive years (e.g. 2017, 2018) are NOT turned into a single range.

  • Consecutive ranges (e.g. 2017-2019, 2020-2022) are turned into a single range (2017-2022).

  • Overlapping ranges (e.g. 2017-2022, 2019-2021) are turned into a single range (2017-2022).

  • Repeated ranges are removed.

  • Ranges with non-year ends (e.g. 2017-Present, 2020-Present) are only merged with other ranges that have an identical end (here, into 2017-Present).
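The numeric rules above resemble classic interval merging with one exception for pairs of single years. The sketch below illustrates that idea on plain `(start, end)` integer tuples, where a single year has `start == end`. It is not the library's implementation: the function name `compact_years` is hypothetical, non-year ends such as 'Present' are left out entirely, and merging a single year into an adjacent range is an assumption the description does not confirm.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int]  # (start, end); a single year has start == end


def compact_years(ranges: Iterable[Span]) -> Tuple[Span, ...]:
    """Hypothetical sketch of the numeric compaction rules."""
    merged: List[list] = []  # each entry: [start, end, built_from_a_range]
    # set() drops repeated ranges; sorted() orders by start year.
    for start, end in sorted(set(ranges)):
        is_range = end > start
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or directly consecutive: extend the previous span.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2] = merged[-1][2] or is_range
        else:
            merged.append([start, end, is_range])

    result: List[Span] = []
    for start, end, from_range in merged:
        if end == start + 1 and not from_range:
            # Exactly two consecutive single years stay separate.
            result.extend([(start, start), (end, end)])
        else:
            result.append((start, end))
    return tuple(result)


print(compact_years([(2019, 2019), (2017, 2017), (2018, 2018)]))  # ((2017, 2019),)
print(compact_years([(2017, 2017), (2018, 2018)]))  # ((2017, 2017), (2018, 2018))
print(compact_years([(2017, 2019), (2020, 2022)]))  # ((2017, 2022),)
```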

class reuse.copyright.CopyrightNotice(name: str, prefix: CopyrightPrefix = CopyrightPrefix.SPDX, years: tuple[YearRange, ...] = <factory>)

Bases: object

Represents a single copyright notice.

name: str

The copyright holder. Strictly, this is all text in the copyright notice which is not part of years.

prefix: CopyrightPrefix = 'SPDX-FileCopyrightText:'

The prefix with which the copyright statement begins.

years: tuple[YearRange, ...]

The dates associated with the copyright notice.

original: str | None = None

If parsed from a string, this contains the original string.

classmethod from_string(value: str) -> CopyrightNotice

Create a CopyrightNotice object from a string.

Raises:

CopyrightNoticeParseError – The string is not a valid copyright notice.

classmethod from_match(value: Match) -> CopyrightNotice

Create a CopyrightNotice object from a regular expression match using the COPYRIGHT_NOTICE_PATTERN re.Pattern.

classmethod merge(copyright_notices: Iterable[CopyrightNotice]) -> set[CopyrightNotice]

Given an iterable of CopyrightNotice, merge all notices which have the same name. The years are compacted, and from the CopyrightPrefix prefixes in copyright_notices, the most common is chosen. If there is a tie in frequency, choose the one which appears first in the enum.
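The prefix selection rule (most common wins, ties go to the member declared first in the enum) can be sketched as follows. `PrefixSketch`, `pick_prefix`, and `merge_prefixes_by_name` are hypothetical names, notices are reduced to `(name, prefix)` pairs, and the compaction of years is left out.

```python
from collections import Counter, defaultdict
from enum import Enum


class PrefixSketch(Enum):
    """Hypothetical, reduced stand-in for CopyrightPrefix."""

    SPDX = "SPDX-FileCopyrightText:"
    STRING = "Copyright"
    SYMBOL = "©"


def pick_prefix(prefixes):
    """Return the most frequent prefix; ties go to the earliest enum member."""
    counts = Counter(prefixes)
    enum_order = {member: index for index, member in enumerate(PrefixSketch)}
    # Sort key: highest count first, then lowest declaration index.
    return min(counts, key=lambda member: (-counts[member], enum_order[member]))


def merge_prefixes_by_name(notices):
    """Group (name, prefix) pairs by name and pick one prefix per name."""
    grouped = defaultdict(list)
    for name, prefix in notices:
        grouped[name].append(prefix)
    return {name: pick_prefix(prefixes) for name, prefixes in grouped.items()}


notices = [
    ("Jane Doe", PrefixSketch.STRING),
    ("Jane Doe", PrefixSketch.SYMBOL),
    ("Jane Doe", PrefixSketch.SYMBOL),
    ("John Doe", PrefixSketch.SYMBOL),
    ("John Doe", PrefixSketch.SPDX),
]
print(merge_prefixes_by_name(notices))
```

Here 'Jane Doe' gets SYMBOL because it occurs twice, and 'John Doe' gets SPDX because the tie is broken by enum order.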

to_string(original: bool = False) -> str

Converts the internal representation of the copyright notice into a string. If original is True, original is returned if it exists.

This method is identical to calling str() on this object, provided original is False.

class reuse.copyright.SpdxExpression(text: dataclasses.InitVar[str])

Bases: object

A simple dataclass that contains an SPDX License Expression.

Use SpdxExpression.__str__() to get a string representation of the expression.

text: dataclasses.InitVar[str]

A string representing an SPDX License Expression. It may be invalid.

property is_valid: bool

If text is a valid SPDX License Expression, this property is True.

To be ‘valid’, it has to follow the grammar and syntax of the SPDX specification. The licenses and exceptions need not appear on the license list.

property licenses: list[str]

Return a list of licenses used in the expression, in order of appearance, without duplicates.

If the expression is invalid, the list contains a single item, text.

classmethod combine(spdx_expressions: Iterable[SpdxExpression]) -> SpdxExpression

Combine the spdx_expressions into a single SpdxExpression, joined by AND operators.
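Joining expressions with AND needs some care when an operand itself contains OR, since AND binds more tightly than OR in SPDX License Expressions. The sketch below works on plain strings and wraps any compound operand in parentheses; the function name `combine_with_and` is hypothetical, and whether the library parenthesises operands this way is an assumption, not something stated here.

```python
from typing import Iterable


def combine_with_and(expressions: Iterable[str]) -> str:
    """Hypothetical sketch: join expression strings with AND."""
    parts = []
    for expression in expressions:
        text = expression.strip()
        # Assumption: a space marks a compound expression, which is wrapped
        # in parentheses so that an inner OR keeps its meaning.
        parts.append(f"({text})" if " " in text else text)
    return " AND ".join(parts)


print(combine_with_and(["MIT", "GPL-3.0-or-later OR Apache-2.0"]))
# MIT AND (GPL-3.0-or-later OR Apache-2.0)
```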

simplify() -> SpdxExpression

If the expression is valid, return a new SpdxExpression which is ‘simplified’, meaning that boolean operators are collapsed. ‘MIT OR MIT’ simplifies to ‘MIT’, and so forth.

If the expression is not valid, self is returned.

class reuse.copyright.SourceType(value)

Bases: Enum

An enumeration representing the types of sources for license information.

DOT_LICENSE = 'dot-license'

A .license file containing license information.

FILE_HEADER = 'file-header'

A file header containing license information.

DEP5 = 'dep5'

A .reuse/dep5 file containing license information.

REUSE_TOML = 'reuse-toml'

A REUSE.toml file containing license information.

class reuse.copyright.ReuseInfo(*, spdx_expressions: set[SpdxExpression] = <factory>, copyright_notices: set[CopyrightNotice] = <factory>, contributor_lines: set[str] = <factory>, path: str | None = None, source_path: str | None = None, source_type: SourceType | None = None)

Bases: object

Simple dataclass holding licensing and copyright information.

spdx_expressions: set[SpdxExpression]
copyright_notices: set[CopyrightNotice]
contributor_lines: set[str]
path: str | None = None
source_path: str | None = None
source_type: SourceType | None = None
copy(**kwargs: Any) -> ReuseInfo

Return a copy of ReuseInfo, replacing the values of attributes with the values from kwargs.
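For a dataclass, this kind of copy-with-overrides is commonly built on `dataclasses.replace`. The sketch below uses a reduced stand-in class (`InfoSketch`, hypothetical) with only a few of the documented fields; whether ReuseInfo.copy is implemented this way is an assumption.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Optional, Set


@dataclass
class InfoSketch:
    """Hypothetical, reduced stand-in for ReuseInfo."""

    contributor_lines: Set[str] = field(default_factory=set)
    path: Optional[str] = None
    source_path: Optional[str] = None

    def copy(self, **kwargs: Any) -> "InfoSketch":
        # replace() builds a new instance; fields not named in kwargs keep
        # their current values (shallowly, so set objects are shared).
        return replace(self, **kwargs)


first = InfoSketch(contributor_lines={"Jane Doe"}, path="foo.py")
second = first.copy(path="bar.py")
print(first.path, second.path)   # foo.py bar.py
```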

union(*other: ReuseInfo) -> ReuseInfo

Return a new instance of ReuseInfo where all set attributes are equal to the union of the set in self and the set(s) in other.

All non-set attributes are set to their values in self.

>>> one = ReuseInfo(copyright_notices={CopyrightNotice("Jane Doe")},
...           source_path="foo.py")
>>> two = ReuseInfo(copyright_notices={CopyrightNotice("John Doe")},
...           source_path="bar.py")
>>> result = one.union(two)
>>> print([notice.name for notice in sorted(result.copyright_notices)])
['Jane Doe', 'John Doe']
>>> print(result.source_path)
foo.py
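The same behaviour can be reproduced without the library by a stand-in dataclass that unions every set-valued field and keeps the rest from self. `UnionSketch` is hypothetical and carries only two of the documented fields; detecting set attributes with `isinstance` is an assumption about how such a method might be written.

```python
from dataclasses import dataclass, field, fields, replace
from typing import Optional, Set


@dataclass
class UnionSketch:
    """Hypothetical, reduced stand-in for ReuseInfo."""

    contributor_lines: Set[str] = field(default_factory=set)
    source_path: Optional[str] = None

    def union(self, *other: "UnionSketch") -> "UnionSketch":
        updates = {}
        for item in fields(self):
            value = getattr(self, item.name)
            if isinstance(value, set):
                # Union this set with the matching set of every other instance.
                updates[item.name] = value.union(
                    *(getattr(info, item.name) for info in other)
                )
        # Non-set attributes are not in updates, so they come from self.
        return replace(self, **updates)


one = UnionSketch(contributor_lines={"Jane Doe"}, source_path="foo.py")
two = UnionSketch(contributor_lines={"John Doe"}, source_path="bar.py")
result = one.union(two)
print(sorted(result.contributor_lines))  # ['Jane Doe', 'John Doe']
print(result.source_path)                # foo.py
```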

Either spdx_expressions or copyright_notices is non-empty.

One of spdx_expressions or copyright_notices is non-empty.

contains_info() -> bool

Any field except path, source_path and source_type is non-empty.